ABSTRACT

Linguistic corpora, spoken and written, have been developed for a wide range of language studies, for instance in speech science, syntax, semantics, pragmatics, language acquisition, language dysfunction, and conversation analysis. These studies have been put to practical use in designing systems for speech synthesis and recognition and for grammatical tagging and parsing, in writing dictionaries, in analysing style and text types, in translation, in describing language development, and in treating language disorders. Some corpora were designed for a specific purpose; the COBUILD collection, for instance, was developed specifically to provide a basis for English dictionaries and learning materials. Others, such as the Brown and LOB corpora, were planned without an immediate project in mind, with the intention that they remain open to a wide range of future uses. But in all corpora, the needs of users and practical constraints determine crucial decisions that might otherwise be debated indefinitely.