ABSTRACT

Textual databases are becoming major resources for research in language and linguistics. By “textual database” we mean a body of continuous text, either written or transcribed from speech. It may consist of one or more complete texts, in the literary sense, or of samples of text. This chapter discusses important issues in the acquisition and use of these databases. An overview of existing resources is given, followed by an examination of markup schemes and software tools. The emphasis is on tools for the ordinary working linguist, and the chapter concludes with a brief assessment of what he or she can expect to achieve using these techniques.