Introduction | 1 | A Corpus of Formal British English Speech

ABSTRACT

The Lancaster/IBM Spoken English Corpus began in September 1984 as part of a research project into the automatic assignment of intonation. The project was undertaken by members of the Unit for Computer Research on the English Language at Lancaster University, in collaboration with the Speech Research Group at the IBM UK Scientific Centre. Its first task was to collect samples of natural spoken British English that could serve as a database for analysis and for testing the intonation assignment programs. The result is a machine-readable corpus of approximately 52,637 words of contemporary spoken British English.