Evaluating the Spoken BNC2014
This chapter evaluates and reflects on the representativeness of the completed Spoken British National Corpus 2014 (Spoken BNC2014), which comprises 11,422,617 tokens produced by 668 speakers across 1,251 transcribed recordings. After providing basic frequency information about the corpus, the chapter assesses both its target domain representativeness and its linguistic representativeness. For the target domain, comparison against UK population data suggests that the Spoken BNC2014 overrepresents speakers from England and underrepresents speakers from Scotland, Wales and Northern Ireland. The original target domain (reported in Chapter 3) is therefore revised: the Spoken BNC2014 can be said to represent informal spoken English, produced by L1 speakers of British English, in England, in the mid-2010s. The corpus is found to be sufficiently representative of a range of linguistic features to be trusted as a resource for the study of informal spoken English.