ABSTRACT

This chapter has two objectives: to trace the development, along with a rough timeline, of the use of corpora for pragmatic purposes, and to point out where some of the main issues and opportunities in doing corpus pragmatics lie. Corpora are collections of naturally occurring texts that are compiled according to specific design principles. General corpora are assumed to fulfil two important design criteria, those of balance and representativeness. Corpus linguistics, as a method of linguistic investigation, generally uses corpora for developing or testing linguistic hypotheses based on real-life data, thereby complementing and/or corroborating the researcher's intuition, using relatively large samples of data that can be processed quickly and efficiently by computer. Single-word frequency lists or n-gram lists allow researchers to identify salient topics, recurring lexical items or chunks, or relevant grammatical constructions in corpora. One major critical issue in annotating corpora pragmatically has always been the depth and precision of the speech-act taxonomies employed.
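
As a rough illustration of the frequency and n-gram lists mentioned above, the following minimal Python sketch counts single words and bigrams in a short text. The sample utterances are invented for illustration, the function names tokenize and ngram_counts are hypothetical, and the regular-expression tokenizer is a deliberate simplification of what a corpus tool would do.

```python
import re
from collections import Counter


def tokenize(text):
    # Lowercase the text and keep alphabetic word forms, allowing an
    # internal apostrophe (e.g. "don't"). This is a simplifying assumption.
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def ngram_counts(tokens, n=1):
    # Slide a window of n tokens across the sequence and count each n-gram.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Invented toy utterances, not drawn from any real corpus.
sample = "Could you pass the salt? Could you open the window? I mean, could you?"

tokens = tokenize(sample)
word_freq = ngram_counts(tokens, 1)    # single-word frequency list
bigram_freq = ngram_counts(tokens, 2)  # two-word chunks

print(word_freq.most_common(3))
print(bigram_freq.most_common(3))
```

In this toy sample the bigram "could you" recurs three times, which is the kind of recurring chunk a researcher might then inspect in context as a possible request formula.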