ABSTRACT

When a model can be formed by including some, or all, of the predictor variables, we must decide how many variables to include. The decision we arrive at will depend to some extent on the purpose we have in mind. If we merely wish to explain the variation of the dependent variable in the sample, then it would seem obvious that as many predictor variables as possible should be included. This can be seen with the lactation curve of Example 2.11. If enough powers of w were added to the model, the curve would pass through every observed value, but it would be so jagged and complicated that it would be difficult to understand what was happening. On the other hand, a small model has the advantage that the relationships between the variables are easy to understand. Furthermore, a small model will usually yield estimators which are less influenced by peculiarities of the sample and so are more stable.
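The claim that enough powers of w will reproduce every observed value can be checked numerically. The sketch below is only an illustration under stated assumptions: the six observations are synthetic stand-ins generated in the code, not the lactation data of Example 2.11, and the names `fit_rss` and `rss` are hypothetical helpers introduced here. It fits polynomials of increasing degree by least squares and records the residual sum of squares for each.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (an assumption; NOT the data of Example 2.11).
w = np.arange(1.0, 7.0)                      # six time points
y = 10.0 * w * np.exp(-0.3 * w) + rng.normal(scale=0.5, size=w.size)

# Rescale w to (0, 1] so that high powers stay numerically well behaved.
x = w / w.max()


def fit_rss(x, y, degree):
    """Least-squares fit of a polynomial of the given degree; return the RSS."""
    X = np.vander(x, degree + 1, increasing=True)   # columns 1, x, ..., x^degree
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


# Degrees 1 to n - 1; the last one has as many coefficients as observations.
rss = {d: fit_rss(x, y, d) for d in range(1, x.size)}

for d, value in rss.items():
    print(f"degree {d}: RSS = {value:.3e}")
```

Because the models are nested, the residual sum of squares cannot increase as powers are added, and with as many coefficients as observations it falls to zero apart from rounding error. A zero residual in the sample says nothing about how well the curve would describe new data, which is the point made above about stability.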