ABSTRACT

Table 6.1 gives a small set of data on the average vocabulary size of children at various ages. Is it possible (or indeed sensible) to use these data to construct a model for predicting the vocabulary size of children older than 6, and if so, how should we go about it? Such questions serve to introduce one of the most widely used statistical techniques, regression analysis. (It has to be admitted that the method is also often misused.) In very general terms, regression analysis involves the development and use of statistical techniques designed to reflect the way in which variation in an observed random variable changes with changing circumstances. More specifically, the aim of a regression analysis is to derive an equation relating a dependent (response) variable to an explanatory variable or, more commonly, to several explanatory variables. The derived equation may sometimes be used solely for prediction, but more often its primary purpose is to establish the relative importance of the explanatory variable(s) in determining the response variable, that is, to establish a useful model to describe the data. (Incidentally, the term regression was first introduced by Galton in the 19th century to characterize a tendency toward mediocrity, that is, toward more average values, observed in the offspring of parents.)

Table 6.1 The Average Oral Vocabulary Size of Children at Various Ages

Age (Years)    Number of Words
1.0            3
1.5            22
2.0            272
2.5            446
3.0            896
3.5            1222
4.0            1540
4.5            1870
5.0            2072
6.0            2562
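
To make the idea of "deriving an equation" concrete, here is a minimal sketch that fits a straight line to the vocabulary data by ordinary least squares. The straight-line form is an assumption chosen purely for illustration, not a model endorsed by the text, and the names `ages`, `words`, `fit_line`, and `predict` are hypothetical helpers introduced for this example.

```python
# Illustrative sketch only: a straight-line least-squares fit to the
# vocabulary data. The linear form and all names are assumptions.

# Data copied from the table: age in years, average number of words.
ages = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 6.0]
words = [3, 22, 272, 446, 896, 1222, 1540, 1870, 2072, 2562]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-products with y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(age, intercept, slope):
    """Fitted vocabulary size at a given age under the straight-line model."""
    return intercept + slope * age


intercept, slope = fit_line(ages, words)
print(f"fitted line: words = {intercept:.2f} + {slope:.2f} * age")
print(f"fitted value at age 7: {predict(7.0, intercept, slope):.0f} words")
print(f"fitted value at age 1: {predict(1.0, intercept, slope):.0f} words")
```

Under this assumed straight line the fitted value at age 1 comes out negative, which hints at why the question of whether such a model is sensible, and not only possible, deserves attention before the equation is used to predict beyond age 6.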