ABSTRACT

This chapter provides a non-exhaustive overview of important existing corpora, focussing on those compiled for research in corpus linguistics proper. It begins by describing the formats corpora come in, including their general representation and potential linguistic annotations on multiple levels. It then discusses the conceptual design options for corpora, distinguishing between large-scale general or national corpora, smaller specialised corpora, comparable and translation or parallel corpora, and static vs. dynamic corpora, as well as potential sub-divisions thereof. The next section describes the historical development of corpora from the early to mid-1960s to the present. This is followed by a discussion of the different options for accessing such corpora online and offline, as well as issues of usability and suitability for different research purposes. The chapter closes with a brief look at the potential future of corpora.