ABSTRACT

Upon its initial release, the Spoken BNC2014 was made available via Lancaster University’s CQPweb server. CQPweb (Hardie, 2012) is a component of the open-source Corpus Workbench system; its design closely follows that of BNCweb, which was custom-designed for the original BNC1994. By releasing the corpus in this way, we have furnished scholars with many familiar affordances. These include the basic query functionality, with its search syntax, and the concordance display which is the basis for further analyses. Because the Spoken BNC2014 contains extensive XML markup, CQPweb can use this information to show phenomena such as utterance boundaries, overlaps, and speaker identification within the concordance and extended context views. The extensive text and speaker metadata compiled for the corpus allows CQPweb queries to be restricted to particular subsets of the corpus; these divisions can also be used as a basis for analysing the distribution of query results across the corpus. Other statistical procedures available within the system are the collocation and keywords tools, which implement these very widely used corpus linguistic methods with many options for adjusting how the statistics are calculated. We hope that researchers will find the CQPweb system valuable in their exploration of the Spoken BNC2014.