ABSTRACT

This chapter focuses on computer-based language corpora (collections of texts that can be searched using specialised software) that are relatively small and are specialised in some way, for example in terms of setting, text type, genre or topic. It begins by defining “a small corpus” and, with reference to a variety of existing corpora, reviewing the different ways in which a corpus can be specialised. Next, it discusses important considerations in designing and compiling a small, specialised corpus, in particular how representativity and balance can be achieved. The final section looks at what can be learnt from small, specialised corpora, illustrating this with examples from a range of corpus studies. The chapter argues that small, specialised corpora can have advantages over larger corpora because contextual features are available and retrievable, and can be linked to specific language patterns.