ABSTRACT

Sometimes the fit of a simple linear regression model can be improved by adding one or more explanatory variables. For instance, the crop yield may depend not only on the amount of fertilizer but also on the amount of water. In this instance we have one response variable, the yield Y, and two explanatory variables, the amount of fertilizer x1 and the amount of water x2. Denote the relation between the expected yield and x1, x2 by f(x1, x2). The object of regression analysis is to describe the unknown response function f(x1, x2). In many cases, the response function can be approximated by a function that is linear in the variables; that is, we assume that the functional relation between the expected crop yield and the amounts of fertilizer and water is

E(Y | fertilizer = x1; water = x2) = β0 + β1x1 + β2x2 = f(x1, x2).

Here, the values of the variables x1, x2 are known to the researcher while the regression parameters β0, β1, β2 are not. Multiple linear regression is a statistical procedure for estimating and making inferences about the regression parameters. Since this procedure leads to tedious and complex calculations when done by hand, we assume the student has access to a computer equipped with a standard statistical software package. Our primary goal in this chapter is to present those basic concepts of multiple linear regression that are essential for understanding the computer printouts. The most convenient and efficient way to reach this goal is to use vectors and matrices to represent the response variable, the explanatory variables, and the relations among them. For this reason we assume the student is familiar with the following basic concepts from linear algebra: vector space, subspace, vectors, matrices, and matrix multiplication. Additional concepts will be introduced as needed. It is instructive to begin with the matrix formulation of the simple linear regression model studied in Chapter 10. This will not only ease the transition to the more complicated multiple linear regression models discussed later, but will also give us some additional insights into the simple linear regression model itself.
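
To make the matrix point of view concrete, the following is a minimal sketch of how the parameters β0, β1, β2 in the fertilizer and water example might be estimated by least squares. It assumes the usual least-squares criterion and solves the normal equations (X'X)b = X'y directly. The data values and the function name fit_linear_model are hypothetical, chosen only for illustration. A standard statistical package would normally use a more numerically stable method and would also report standard errors.

```python
# Illustrative sketch only: the data and the function name are hypothetical,
# not taken from the text.

def fit_linear_model(rows, y):
    """Least-squares estimates [b0, b1, ...] obtained from the normal equations."""
    # Design matrix X: a leading 1 for the intercept, then the explanatory variables.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    n, p = len(X), len(X[0])

    # Form X'X (p x p) and X'y (length p).
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
           for a in range(p)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]

    # Solve (X'X) b = X'y by Gauss-Jordan elimination with partial pivoting.
    aug = [xtx[a] + [xty[a]] for a in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular; explanatory variables are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [u - factor * v for u, v in zip(aug[r], aug[col])]
    return [aug[a][p] / aug[a][a] for a in range(p)]


# Made-up, noise-free data generated from beta0 = 2, beta1 = 0.5, beta2 = 1.5,
# so the fit should recover these values up to rounding error.
fertilizer = [1, 2, 3, 4, 5, 6]
water = [2, 1, 4, 3, 6, 5]
yields = [2 + 0.5 * f + 1.5 * w for f, w in zip(fertilizer, water)]

beta_hat = fit_linear_model(list(zip(fertilizer, water)), yields)
print([round(b, 6) for b in beta_hat])
```

Because the made-up yields contain no random error, the estimates coincide with the generating values up to floating-point rounding. With real data the estimates would differ from the true parameters, which is why inference about them is needed.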