TextSTAT is text concordance software that runs on both Windows (XP or Vista) and macOS.

According to the TextSTAT website the Windows version "includes everything you need to use TextSTAT with Windows. It comes as a single installation file." I haven't tested this.

I've successfully run TextSTAT on macOS, but it takes a little doing: you have to download the source code (written in the Python programming language) and launch TextSTAT using another application, MacPython.

Fortunately, MacPython comes packaged with macOS, so you shouldn't have to download it separately. However, according to the TextSTAT website, you won't have success with versions of Python later than 2.6. When we get to the point in these instructions when it's time to use MacPython, you'll find advice on what to do if your version is too recent.

So, let's get started.

First, download, unzip, and install TextSTAT.

On the homepage for TextSTAT, find the link to download the Python source code for the application. Click to download.

Once you've unzipped the download and moved it into place, you should end up with a folder named "TextSTAT2-source" in your Applications folder. Note that among the files inside this folder, there's one named "TextSTAT.pyw". Later, you'll be dragging the icon for this file into a Terminal window. If you don't know about Terminal, don't worry. You'll learn about it in a moment.

Next, download the text you'd like to analyze.

This isn't absolutely necessary — you can point TextSTAT to files that aren't on your hard drive — but it's recommended for the present tutorial.

For our example, we'll use the Gutenberg version of Thoreau's Walden. Follow the link to find it.

Notice that you want to download Walden in Plain Text UTF-8 format. Take note of where the downloaded file lands on your hard drive. (Unless you've changed the default location in your browser preferences, this should be your Downloads folder.)

It's almost time to launch TextSTAT using MacPython.

But first, check on your version of MacPython. You should see MacPython in your Applications folder. If your version of Python is later than 2.6, download and install the older version.
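If you'd rather let Python report its own version, here's a minimal sketch of that check. The 2.6 ceiling comes from the TextSTAT website's advice quoted earlier; the function name and the messages are my own invention for illustration. Save it under any name you like and run it with the Python you plan to use for TextSTAT.

```python
import sys

# Assumption: (2, 6) mirrors the limit quoted from the TextSTAT website.
MAX_SUPPORTED = (2, 6)


def is_supported(version_info, max_supported=MAX_SUPPORTED):
    """Return True if the (major, minor) version is no later than max_supported."""
    return tuple(version_info[:2]) <= max_supported


if __name__ == "__main__":
    current = sys.version_info
    print("Running Python %d.%d" % (current[0], current[1]))
    if is_supported(current):
        print("This version should be old enough for TextSTAT.")
    else:
        print("This version is later than 2.6; consider installing the older one.")
```

The comparison looks only at the major and minor numbers, so 2.6.9 still counts as 2.6.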

Now the fun begins.

Time to use the Terminal application on your Mac, where you can type commands that make your Mac do things without your having to navigate windows or click on icons. In other words, you bypass your Mac's GUI (graphical user interface). Terminal is a power tool, and like all power tools it deserves your respect. If you don't know what you're doing, and you're not careful, you can hurt yourself (well, your Mac, anyway). To get a feel for some of the things you can do in Terminal, from within the Terminal application go to Help > Terminal Help > Learning about UNIX commands.

But back to the mission at hand. Find Terminal in Applications > Utilities and fire it up.

Terminal will open in a plain-text terminal window, and some text will appear in the window right away, ending with the account name you're logged into on your Mac, followed by a dollar sign. In my case, that would be ~ schacht$

As shown in the two screenshots below, you'll next want to do two things:

  1. type "pythonw" (no quotation marks) after the dollar sign
  2. drag the icon of the file named "TextSTAT.pyw" (inside the folder "TextSTAT2-source", remember?) into the Terminal window immediately following where you've typed "pythonw" (but do leave a space between "pythonw" and the file path)

You may have just discovered a nice feature of the Mac GUI that you weren't aware of. If you drag a file icon into an application (as opposed to another window on your desktop) the path to the file on your hard drive — rather than the file itself — gets copied to the target spot.

Okay. Click in the Terminal window at the end of the file path, take a deep breath, then hit the return/enter key on your keyboard. Within a few seconds, two things should happen: MacPython should launch, and a TextSTAT window should open on your Mac.
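For the curious, here's a rough sketch of what the line you just entered is made of: the launcher name, a space, and the full path to "TextSTAT.pyw". The function names are hypothetical, and the folder location is an assumption based on the Applications folder mentioned earlier. This sketch is meant for a current Python on any machine; it only builds and prints the line, it doesn't launch anything.

```python
import posixpath
import shlex


def build_launch_command(source_dir, script_name="TextSTAT.pyw", launcher="pythonw"):
    """Return the launcher and the script path as a list of two parts."""
    script_path = posixpath.join(source_dir, script_name)
    return [launcher, script_path]


def as_terminal_line(command):
    """Join the parts with spaces, quoting any part that contains spaces."""
    return " ".join(shlex.quote(part) for part in command)


if __name__ == "__main__":
    # Assumption: the source folder sits directly inside /Applications.
    command = build_launch_command("/Applications/TextSTAT2-source")
    print(as_terminal_line(command))
```

If a folder in the path has a space in its name, the path has to be protected so the shell reads it as one item. This sketch wraps it in single quotes; the path you get by dragging into Terminal may be protected in a different but equivalent way.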

Final steps.

To perform a text analysis, TextSTAT needs text to analyze, aka a "corpus."

So first, you need to add a corpus to TextSTAT. That would be your plain text copy of Walden.

Walden will need a little readying for this. The file you downloaded from Gutenberg contains some very important words about the Gutenberg project that, if included in your corpus, will unfortunately throw off your results, since the words aren't part of Walden. So open the text of Walden, scroll to the end, back up to where Thoreau's text itself ends, and delete all the Gutenberg information. Save the file.
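If you'd rather not do the trimming by hand, here's a minimal sketch that does the same thing in Python. It assumes the Gutenberg information begins at a line starting with "*** END OF THE PROJECT GUTENBERG EBOOK" (or "*** END OF THIS ..."); check your own copy, because the marker may be worded differently. The function names and file names are placeholders of my own.

```python
import io
import re

# Assumption: the closing Gutenberg section starts at a line like
# "*** END OF THE PROJECT GUTENBERG EBOOK WALDEN ***".
END_MARKER = re.compile(
    r"^\*\*\*\s*END OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*$",
    re.MULTILINE,
)


def strip_gutenberg_footer(text):
    """Return text with everything from the end marker onward removed."""
    match = END_MARKER.search(text)
    if match is None:
        # No marker found: leave the text alone rather than guess.
        return text
    return text[:match.start()].rstrip() + "\n"


def clean_file(source_path, target_path):
    """Write a trimmed copy of source_path to target_path (UTF-8)."""
    with io.open(source_path, encoding="utf-8") as source:
        trimmed = strip_gutenberg_footer(source.read())
    with io.open(target_path, "w", encoding="utf-8") as target:
        target.write(trimmed)
```

Calling clean_file("walden.txt", "walden-trimmed.txt") (both names hypothetical) writes the trimmed text to a new file and leaves the original download untouched.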

However, respect the Gutenberg community: any version of Gutenberg's Walden that you share with others must always include the Gutenberg information at the end.

The video below shows what to do next:

  1. Double-click the "New corpus" icon, name the corpus whatever you want ("Thoreau" in the video), click "Save," then click "OK" in the Python dialog that opens up.
  2. Now double-click the "Add local file" icon, navigate to the location of Walden on your hard drive, and click "Open." This will put the path to your Walden file in TextSTAT.
  3. Very deep breath. Double-click the "Show word frequencies" icon in TextSTAT. Voilà: you should see a list of all the words in Walden in order of frequency.
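To give a sense of what a word-frequency list is, here's a small sketch that counts words in a piece of text. This is not TextSTAT's own code, and the way it splits text into words (lowercase letters, with apostrophes allowed inside a word) is a simplifying assumption, so its counts may not match what TextSTAT shows.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Return (word, count) pairs, most frequent first."""
    # Assumption: a word is a run of letters a-z, optionally with
    # apostrophes inside it (as in "don't"), compared in lowercase.
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return Counter(words).most_common()


if __name__ == "__main__":
    for word, count in word_frequencies("The pond, the woods, and the pond."):
        print(word, count)
```

To try it on the whole book, read your trimmed Walden file into a string and pass that string to word_frequencies.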
