ABSTRACT

This chapter discusses the design of the Spoken BNC2014 and describes the major obstacles that had to be overcome in constructing it, such as the affordability of the corpus, the feasibility of data collection and the range of possible uses of the dataset. In addition to a brief historical account of these issues, the chapter discusses the design decisions taken and their motivation. The chapter thus provides important background for the Spoken BNC2014 and highlights more general methodological considerations related to building a spoken corpus.