ABSTRACT


Word rank-frequency distributions are often used as input to authorship attribution investigations. This chapter uses advanced RapidMiner processes to fit the Zipf-Mandelbrot distribution to sequential fixed-size windows within a single document, creating a summary of the observed fluctuations for an author. These are compared with random samples of the same size from the same document. Different works by the same author, as well as works by different authors, are assessed to determine whether there is a consistent pattern for individual authors. Some initial evidence of consistent variation across different works by the same author is observed.
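
To make the windowed fitting idea concrete, the following is a minimal Python sketch, not the RapidMiner processes used in the chapter. It assumes the usual Zipf-Mandelbrot form f(r) = C / (r + b)^a for the frequency of the word at rank r, and it fits the parameters by least squares on log frequencies with a grid search over b. All function names, the grid range for b, the use of non-overlapping windows, and sampling without replacement are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def rank_frequencies(tokens):
    """Return word counts sorted in descending order (rank 1 first)."""
    return sorted(Counter(tokens).values(), reverse=True)


def fit_zipf_mandelbrot(freqs, b_grid=None):
    """Fit log f(r) = log C - a * log(r + b) by least squares.

    For a fixed b the model is linear in log(r + b), so a and log C come
    from ordinary linear regression; b is chosen by grid search.
    Returns the tuple (a, b, C).
    """
    if len(freqs) < 2:
        raise ValueError("need at least two ranks to fit")
    if b_grid is None:
        # Assumed search range for b: 0.0 to 10.0 in steps of 0.1.
        b_grid = [i * 0.1 for i in range(101)]
    ys = [math.log(f) for f in freqs]
    n = len(ys)
    mean_y = sum(ys) / n
    best = None
    for b in b_grid:
        xs = [math.log(r + b) for r in range(1, n + 1)]
        mean_x = sum(xs) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
        sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
        if best is None or sse < best[0]:
            best = (sse, -slope, b, math.exp(intercept))
    return best[1], best[2], best[3]


def sequential_windows(tokens, size):
    """Yield consecutive, non-overlapping windows of exactly `size` tokens."""
    for start in range(0, len(tokens) - size + 1, size):
        yield tokens[start:start + size]


def random_sample(tokens, size, rng):
    """Draw `size` tokens without replacement from the whole document."""
    return rng.sample(tokens, size)


def compare_windows_to_samples(tokens, size, seed=0):
    """Fit each sequential window and one same-size random sample per window.

    Returns two lists of (a, b, C) tuples: window fits and sample fits.
    """
    rng = random.Random(seed)
    window_fits, sample_fits = [], []
    for window in sequential_windows(tokens, size):
        window_fits.append(fit_zipf_mandelbrot(rank_frequencies(window)))
        sample = random_sample(tokens, size, rng)
        sample_fits.append(fit_zipf_mandelbrot(rank_frequencies(sample)))
    return window_fits, sample_fits
```

Calling `compare_windows_to_samples(tokens, 5000)` on a tokenized document would return one fitted parameter triple per window alongside a triple for a random sample of the same size, so the spread of the window fits can be set against the spread of the sample fits. The window size of 5000 is only a placeholder value.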