Content analysis programs

Content Analysis Programs and References

The years have seen content analysis programs come and go. Many earlier ones have shown considerable staying power. Here I've listed some that I know and others that intrigue me. Most, but not all, are for Windows, and a few are free. I've tried to present the most general sites first. Many of these sources direct users to computer programs that generate "concordances" of important words, often referred to as KWIC programs. This type of analysis is so useful in political research that I isolate some KWIC sites from the pack. Go HERE for KWIC programs.
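
To make the idea of a concordance concrete, here is a minimal keyword-in-context (KWIC) sketch in Python. It is not taken from any of the programs listed on this page; the function name `kwic`, the `width` parameter, and the sample text are all assumptions made for illustration.

```python
import re

def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per occurrence of keyword."""
    lines = []
    # Match the keyword as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

# Made-up sample text, used only to show the output layout.
sample = (
    "The party promised lower taxes. Voters doubted the party, "
    "but the Party won the election."
)
for line in kwic(sample, "party", width=20):
    print(line)
```

Each output line shows the keyword in brackets with a fixed amount of text on either side, which is the layout most concordance programs produce in one form or another.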

Links to sites for content analysis software: This service is provided by some dedicated souls at Georgia State University. It provides links to web sites where one can find information (often including purchasing information) regarding content analysis software as well as other types of software that are often utilized by content analysts.

Neuendorf's Content Analysis Guidebook Online: Kimberly Neuendorf teaches in the Communication Department at Cleveland State University. She prepared this extraordinarily helpful web site to accompany her book, The Content Analysis Guidebook (2001).

William Evans' Content Analysis Resources: William Evans is Director of the Center for Creative Media at the University of Alabama, and his research interests include computer-supported content analysis. Though its home page is not well designed, this site can direct you to a LOT of interesting material.

Virginia's Text Analysis Software: The Electronic Text Center at the University of Virginia Library is deeply involved in text digitizing and analysis. The Center combines an on-line archive of tens of thousands of SGML and XML-encoded electronic texts and images with a library service that offers hardware and software suitable for the creation and analysis of text. This link directs users to available programs.

Quantitative Study of Dreams

Don't be put off by the title. This site deals more generally with content analysis with these links:

More about content analysis
Information for people doing research projects or papers
Non-scientific info, for those not interested in quantitative research

CATPAC II: The vendor says that this Windows program reads any text and summarizes its main ideas; it needs no precoding and makes no linguistic assumptions. One of the leading programs in the field, it is pricey ($595) but there is a $49 student version.

Harald Klein's Text Analysis Info

Harald Klein does "Social Science Consulting" in Rudolstadt, Germany. He is a frequent participant in content analysis listserv exchanges, and his site is very helpful. Note that Klein says that INTEXT, a DOS program, is now in the public domain. You can arrange to order it yourself.

Harald Klein's TextQuest

This site takes you to TextQuest, the Windows version of INTEXT. Note that it costs 100 euros, which is about $100 US.

Diction 5.0

Roderick Hart, on the Speech faculty at the University of Texas, devised Diction many years ago for mainframe computers. DICTION 5.0, now a Windows program, determines the tone of a verbal message and searches a passage for five general features: certainty, activity, optimism, commonality, and realism. He used Diction in his 1984 book, Verbal Style and the Presidency.

The General Inquirer

Devised in the early 1960s by Philip Stone and associates, the General Inquirer launched the era of computerized content analysis. I am delighted to report that there is now an internet Java version of the program. I fed it a portion of President Clinton's 1997 Inaugural Address and produced an analysis! But note that the General Inquirer, like Hart's Diction program, analyzes style more than content.

Minnesota Contextual Content Analysis

This is a product of CL Research, not the University of Minnesota. CL Research develops computational lexicons from machine-readable dictionaries by parsing and processing dictionary definitions, with particular focus on their use in natural language processing applications such as word-sense disambiguation, discourse analysis, question-answering, and content analysis.

VBPro Content Analysis Site, by Mark Miller

VBPro is a popular set of programs with its own discussion list server. Although the programs run under DOS (not Windows), they are menu driven and come with user guides. All output is in ASCII format compatible with most word processors and statistical packages. The programs include procedures to:

help prepare and clean text for further analysis.

create lists of words in a file along with their frequency in alphabetical order or in descending order of frequency.

find and tag key words and phrases in context: sentence, paragraph, or user defined case (case context is usually a news story in my work).

code various units (sentence, paragraph, or user defined case) for frequency or presence/absence of categories of selected words and phrases.

provide the coordinates for mapping terms in a multidimensional space in which the proximities are indicative of the degree to which terms co-occur.
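
The procedures above can be sketched in a few lines of Python. This is not VBPro's code, only a rough illustration of three of the ideas: word frequency lists in both orders, coding units for category frequency, and counting how often terms co-occur. All function names, the sample units, and the category lists are hypothetical, and the sketch stops at co-occurrence counts rather than computing map coordinates.

```python
import re
from collections import Counter
from itertools import combinations

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def frequency_lists(text):
    """Return (alphabetical, by_frequency) lists of (word, count) pairs."""
    counts = Counter(tokenize(text))
    alphabetical = sorted(counts.items())
    # Descending count, with ties broken alphabetically.
    by_frequency = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return alphabetical, by_frequency

def code_units(units, categories):
    """Count category terms in each unit (sentence, paragraph, or case)."""
    coded = []
    for unit in units:
        tokens = tokenize(unit)
        coded.append({
            name: sum(1 for token in tokens if token in terms)
            for name, terms in categories.items()
        })
    return coded

def cooccurrence(units, terms):
    """Count the number of units in which each pair of terms appears together."""
    pairs = Counter()
    for unit in units:
        present = sorted(set(tokenize(unit)) & set(terms))
        pairs.update(combinations(present, 2))
    return pairs

# Made-up units and categories, for illustration only.
units = [
    "Taxes rose and jobs fell.",
    "The senator praised jobs and growth.",
    "Taxes hurt growth, and taxes hurt jobs.",
]
categories = {"economy": {"taxes", "jobs", "growth"}, "actors": {"senator"}}

alphabetical, by_frequency = frequency_lists(" ".join(units))
print(by_frequency[:3])
print(code_units(units, categories))
print(cooccurrence(units, ["taxes", "jobs", "growth"]))
```

A count greater than zero in `code_units` doubles as a presence/absence code, and the pair counts from `cooccurrence` are the kind of proximity data a multidimensional scaling routine would take as input.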

WordStat Content Analysis & Text-Mining Module

WordStat is designed to work with the statistical and "data mining" Windows package, SimStat. Its web site lists many features, including:

Analyses text stored in several records of a data file.

Performs analyses on alphanumeric fields containing short textual information (up to 255 characters) such as responses to open ended questions, titles, descriptions, etc. as well as on longer texts stored in memo fields (up to about 64K of text).

Optional exclusion of pronouns, conjunctions, etc, by the use of user-defined exclusion lists.

Categorization of words or phrases using existing or user-defined dictionaries.

Word and phrase substitution and scoring using wildcards and integer weighting.

Frequency analysis on words, phrases, derived categories or concepts, or user-defined codes entered manually within a text.
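
As a rough illustration of how an exclusion list, a category dictionary, wildcards, and integer weights fit together, here is a small Python sketch. It does not use WordStat or reproduce its dictionary format; the exclusion list, the categories, the patterns, and the weights are all invented for the example.

```python
import re
from collections import Counter
from fnmatch import fnmatchcase

# Hypothetical exclusion list: words dropped before any scoring.
EXCLUDE = {"the", "and", "it", "of", "a", "is", "was", "but"}

# Hypothetical dictionary: category -> list of (wildcard pattern, integer weight).
DICTIONARY = {
    "approval": [("good", 1), ("excel*", 2), ("help*", 1)],
    "criticism": [("bad", 1), ("slow*", 1), ("terribl*", 2)],
}

def score_response(text, dictionary=DICTIONARY, exclude=EXCLUDE):
    """Return weighted category scores for one open-ended response."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in exclude]
    scores = Counter()
    for token in tokens:
        for category, patterns in dictionary.items():
            for pattern, weight in patterns:
                if fnmatchcase(token, pattern):
                    scores[category] += weight
                    break  # score each token at most once per category
    return dict(scores)

response = "The service was excellent and the staff helpful, but checkout is slow."
print(score_response(response))
```

Here `excel*` matches "excellent" with weight 2, `help*` matches "helpful" with weight 1, and `slow*` matches "slow" with weight 1, so the response scores 3 for approval and 1 for criticism.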

TextSmart: SPSS product for text analysis

TextSmart is an intriguing program, bought by SPSS, that does statistical analysis with words, primarily short amounts of text such as responses to open-ended questions in surveys. I own the program but have not figured out how to use it effectively, although I think it has promise.

Megaputer tools for data mining and text analysis

Megaputer offers tools for data mining and knowledge discovery in databases, semantic text analysis, and information retrieval. Megaputer's flagship products are PolyAnalyst™ and TextAnalyst™. In that regard, it appears to be similar to WordStat.

TABARI

This is the successor program to KEDS, which was created by former NU faculty member Phil Schrodt, a quantitative IR scholar who departed for the University of Kansas some years ago. For years, KEDS was available only for Macintosh computers. TABARI runs on Macintosh computers and on PCs using the Linux operating system, and it can be run as a DOS program under Windows. It was designed for the machine coding of international event data using pattern recognition and simple grammatical parsing. It works with short news articles such as those found in wire service reports or chronologies.
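
To give a feel for what machine coding of event data involves, here is a toy Python sketch that looks for a source actor, a verb phrase, and a target actor in a single sentence. It is not TABARI's algorithm and does not use its dictionaries; the actor list, the verb phrases, and the event labels are hypothetical, and real systems rely on far larger dictionaries and more careful parsing.

```python
import re

# Hypothetical toy dictionaries, invented for this example.
ACTORS = {"israel": "ISR", "egypt": "EGY", "syria": "SYR"}
VERBS = {"accused": "ACCUSE", "met with": "CONSULT", "praised": "APPROVE"}

def find_actors(fragment):
    """Return actor codes in the order the actors appear in the fragment."""
    return [ACTORS[word] for word in re.findall(r"[a-z]+", fragment) if word in ACTORS]

def code_event(sentence):
    """Return (source, event, target) for the first dictionary verb phrase
    that has an actor on each side of it, or None if there is none."""
    text = sentence.lower()
    for phrase, event in VERBS.items():
        match = re.search(r"\b" + re.escape(phrase) + r"\b", text)
        if match is None:
            continue
        sources = find_actors(text[:match.start()])
        targets = find_actors(text[match.end():])
        if sources and targets:
            # Nearest actor before the verb is the source; nearest after is the target.
            return (sources[-1], event, targets[0])
    return None

print(code_event("Egypt accused Syria of violating the ceasefire."))
print(code_event("Israel met with Egypt on Tuesday."))
print(code_event("Rain fell on Tuesday."))
```

Even this crude version shows why short wire-service leads suit machine coding: the actor, action, and target usually appear in that order within one sentence.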

Visual Representation of Text: TextArc

A TextArc creates a visual representation of a text on a single page. It is described as "A funny combination of an index, concordance, and summary; it uses the viewer's eye to help uncover meaning." I've not tried it.

WordNet: lexical database for the English language

WordNet® was developed at Princeton by George Miller, a cognitive psychologist. It is an online lexical reference system informed by psycholinguistic theories of human lexical memory. English nouns, verbs, adjectives and adverbs are organized into synonym sets, each representing one underlying lexical concept. Different relations link the synonym sets. Although not a content analysis program, it can be helpful in content analysis.