ABSTRACT

A corpus-based method for identifying a particular type of formulaic sequence, the formulaic word n-gram, is described and evaluated, with a view to determining the authorship of texts. Based on the analysis of a corpus of 100 short narratives produced by 20 authors, statistical results demonstrate that authors differed in their use of formulaic word n-grams. However, when the method was applied qualitatively to attribute a text of unknown authorship to its correct author, it was unsuccessful. The chapter concludes that formulaic word n-grams occur too infrequently in short personal narratives to be of practical use as a marker of authorship.