Statistical aspects of literary style | 7 | Applied Statistics

ABSTRACT

Any consideration of probabilistic aspects in this problem raises conceptual issues. The use of probabilistic ideas, such as standard errors, depends on the tentative working hypothesis that, for certain purposes, the number of kai's per sentence behaves as if it were generated by a probabilistic mechanism. To help interpret the relation between standard deviation and mean, it is useful to consider an idealized stochastic model in which, within a work, each word has a constant probability, independently from word to word, of being 'kai'. The raw data for each work form a frequency distribution, and the source paper in fact gives a number of such distributions, corresponding to various properties. Discussion is much simplified by replacing each distribution by a single number, a derived response variable. Inspection of the frequency distribution for that work suggests that it would be unwise to put any strong interpretation on the anomalous dispersion.
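The idealized model can be made concrete with a small sketch. It assumes, purely for illustration, that every sentence has the same fixed number of words, so that the number of kai's per sentence is binomial with mean np and standard deviation sqrt(np(1 - p)). The sentence length of 20 words, the probability 0.05, and all function names are hypothetical choices, not values or methods taken from the source paper; real sentence lengths vary, which would add further dispersion.

```python
import math
import random


def binomial_mean_sd(n_words, p_kai):
    """Theoretical mean and SD of the kai count in one sentence.

    Assumes a fixed sentence length n_words and a constant,
    independent probability p_kai that any word is 'kai'.
    """
    mean = n_words * p_kai
    sd = math.sqrt(n_words * p_kai * (1.0 - p_kai))
    return mean, sd


def simulate_kai_counts(n_sentences, n_words, p_kai, seed=0):
    """Simulate the number of kai's in each of n_sentences sentences."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    return [
        sum(1 for _ in range(n_words) if rng.random() < p_kai)
        for _ in range(n_sentences)
    ]


def sample_mean_sd(counts):
    """Sample mean and sample SD (divisor n - 1) of a list of counts."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical illustrative values, not estimates from any real text.
    n_words, p_kai = 20, 0.05
    print("model:    ", binomial_mean_sd(n_words, p_kai))
    counts = simulate_kai_counts(20000, n_words, p_kai)
    print("simulated:", sample_mean_sd(counts))
```

When p is small, the model standard deviation is close to the square root of the mean. That gives a rough baseline against which an observed dispersion might be judged, always subject to the working hypothesis that the model is a reasonable description.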