ABSTRACT

This chapter provides an overview of current computer technology with potential applications to text analysis. Quantitative analyses of textual materials have benefited from computing technology for as long as that technology has been available. In fact, some of the earliest computers were designed specifically to support text analysis for code breaking during World War II. Although a scanned bit-mapped image is itself machine-readable and can be printed out, as a form of high-technology copying, it must be converted into text characters before it is useful for content analysis. The advantages of database technology for text applications are so great that some vendors have developed document management systems, which offer database features tailored specifically to text and related materials. In some text analyses no preprocessing is performed, and all extraction is carried out on the original text.