ABSTRACT

This chapter uses semantic field analysis to describe and compare subcorpora of online question-and-answer forum texts. It does so through automated semantic tagging with the UCREL Semantic Annotation System (USAS) and the calculation of statistical significance in Wmatrix. Semantic categories are particularly helpful when comparing relatively small subcorpora, as in this dataset. The chapter demonstrates a systematic approach to the comparison of corpora by using 'corpus-based comparative frequency evidence to drive the selection of words for further study'. It presents analyses that can be carried out in Wmatrix, a web-based corpus analysis and comparison tool developed by Paul Rayson at Lancaster University. Wmatrix is unique in its full integration with the CLAWS part-of-speech tagger and USAS. USAS is a framework for the automatic semantic tagging of input text. It draws upon an extensive lexicon to assign one or more semantic tags to each word or multi-word unit (MWU) in a given text.