ABSTRACT

The British National Corpus (BNC) will contain 100 million words of varied types of modern British English. It is designed to provide an unparalleled resource for the construction of dictionaries, for linguistic research, and for the implementation of natural language processing systems of various kinds. This chapter is concerned with the grammatical tagging of one part of the BNC, the part devoted to spoken language, and discusses how tagging software originally developed for written language has had to be adapted.