ABSTRACT

There are two inference techniques for the classical multiple regression model: a traditional one, implemented today as the default in most statistical packages, and a modern one, available as an option in the better packages. The validity of both techniques depends on assumptions about the so-called errors in our model, defined as

$$e_i = Y_i - \mu(x_{i1}, x_{i2}, \ldots, x_{ip}),$$

that is, the deviations between the expected value of $Y_i$ according to our model and the realisation of $Y_i$. The first assumption is that our model is correctly specified; that is, the expectation of $Y_i$ given the covariate values $x_1, x_2, \ldots, x_p$ should indeed be a linear function of these values. We can express this equivalently by requiring

The expected value of $e_i$ is 0. (1)
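The equivalence between correct specification and assumption (1) can be written out as a short worked equation. This is only a sketch: the coefficient symbols $\beta_0, \beta_1, \ldots, \beta_p$ are notation assumed here for illustration, and the covariate values are treated as fixed.

```latex
\begin{align*}
\mu(x_{i1}, \ldots, x_{ip}) &= \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip}, \\
\mathrm{E}(e_i) &= \mathrm{E}(Y_i) - \mu(x_{i1}, \ldots, x_{ip}), \\
\mathrm{E}(e_i) = 0 \quad &\Longleftrightarrow \quad \mathrm{E}(Y_i) = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip}.
\end{align*}
```

Read this way, a nonzero expected error at some covariate values means that the linear function does not describe the expectation of the outcome there.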

The second assumption is that

$e_1, e_2, \ldots, e_n$ are independent, (2)

which requires that the value of $e_i$ for one unit tells us nothing about the error of another unit. This assumption is typically violated if several measurements from the same unit are used in the analysis. A typical example is given by measurements on one subject at three time points during one day (e.g., before breakfast, lunch, and dinner): the deviation from the model probably goes in the same direction at all three time points; that is, the deviation is mainly an individual effect shared by all three time points. However, as long as we have only one measurement for each unit, there is usually little reason to doubt assumption (2).
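The repeated-measurement example can be illustrated with a small simulation. The following is a minimal sketch under assumed settings, not an analysis from the text: the function names, the normal distributions, the standard deviations, and the sample size are all hypothetical choices. Each error is built as a subject-level effect shared by the three time points plus independent measurement noise.

```python
import random
from math import sqrt


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def simulate_errors(n_subjects, subject_sd, noise_sd, seed=0):
    """Return three lists of errors, one per time point.

    Each error is a subject-level effect shared by all three time points
    plus independent measurement noise (both normal, an assumption).
    Setting subject_sd=0 removes the shared effect.
    """
    rng = random.Random(seed)
    errors = [[], [], []]
    for _ in range(n_subjects):
        subject_effect = rng.gauss(0.0, subject_sd)
        for t in range(3):
            errors[t].append(subject_effect + rng.gauss(0.0, noise_sd))
    return errors


# Hypothetical settings: 2000 subjects, three measurements each.
shared = simulate_errors(2000, subject_sd=1.0, noise_sd=0.5)
independent = simulate_errors(2000, subject_sd=0.0, noise_sd=0.5)

# Correlation between the errors at the first and second time point.
corr_shared = pearson(shared[0], shared[1])
corr_independent = pearson(independent[0], independent[1])

print(f"shared subject effect: corr = {corr_shared:.2f}")
print(f"no subject effect:     corr = {corr_independent:.2f}")
```

With a shared subject effect, the errors at two time points of the same subject are strongly positively correlated, so knowing one tells us a good deal about the other. Without it, the correlation is close to zero, which is the situation assumption (2) describes.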