ABSTRACT

We begin with what is probably the main workhorse of statisticians: the linear model, also known as the regression model. It has been the subject of many books, so rather than going into the mathematics of linear models, we shall discuss mainly the statistical aspects. Our starting point is a formulation adapted to the generalizations that follow later. The components are:

(i) a response variate y, a vector of length n, whose elements are assumed to be independent and normally distributed, with means given by the vector μ and constant variance σ²;

(ii) a set of explanatory variates x_1, x_2, ..., x_p, all of length n, which may be regarded as forming the columns of an n × p model matrix X; it is assumed that the elements of μ are expressible as a linear combination of the effects of the explanatory variates, so that we may write

μ_i = Σ_j β_j x_ij

or, in matrix form, μ = Xβ. Although it is common to write this model in the form

y = Xβ + e

where e is a vector of independent errors, each normally distributed with mean zero and variance σ², this formulation does not lead naturally to the generalizations we shall be making in the next section.
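
The two components can be illustrated numerically. The following is a minimal sketch in Python with NumPy, not part of the original text: the sample size, the model matrix, the effects β and the value of σ are all hypothetical, chosen only for illustration. It generates y directly from its distribution, with mean μ = Xβ, rather than by adding an error vector, and then recovers β by least squares, which coincides with maximum likelihood under the normality and constant-variance assumptions in (i).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions and parameters, for illustration only.
n, p = 200, 3
beta = np.array([1.0, 2.0, -0.5])   # assumed effects
sigma = 0.3                         # assumed standard deviation

# (ii) Model matrix X (n x p): a column of ones plus two explanatory variates.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])

# Systematic part: mu_i = sum_j beta_j * x_ij, i.e. mu = X beta.
mu = X @ beta

# (i) Random part: each y_i is drawn independently from N(mu_i, sigma^2).
y = rng.normal(loc=mu, scale=sigma)

# Least-squares estimate of beta and the usual unbiased estimate of sigma^2.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat
sigma2_hat = residuals @ residuals / (n - p)

print("estimated beta:   ", beta_hat)
print("estimated sigma^2:", sigma2_hat)
```

Drawing y as `rng.normal(loc=mu, scale=sigma)` mirrors the formulation in (i) and (ii): the distribution of the response is specified first, and the explanatory variates enter only through its mean. Writing `y = mu + e` with `e` drawn from N(0, σ²) would give data with the same distribution.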