ABSTRACT

In the last chapter we saw that any particular slope coefficient of a regressor generally depends on the other regressors included in the model. Hence, even if we are interested in the impact of only one of the regressors, it is important to include all relevant variables in the regression equation. In this chapter we take this point further and show that if we omit a relevant variable from the model, least squares will no longer give us unbiased estimators of the coefficients of the population regression. The problem is misspecification due to omitted variable bias. That is, unless all relevant variables are included in the regression equation, none of the estimated parameters will be unbiased (except in a special case shown below). However, this result should not lead to a strategy of including every conceivable variable one can think of in a regression since, with collinear data, regression results are likely to yield foggy messages due to inflated standard errors of the coefficients. Variable selection in model specification is, therefore, a challenging task in applied research and has serious consequences for the validity of the inferences we make from our data.
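The bias described above can be illustrated with a small simulation. The setup below is a hypothetical example (the data-generating process, coefficient values, and correlation structure are assumptions for illustration, not taken from the text): the true model has two regressors, x2 is correlated with x1, and omitting x2 shifts the estimated slope on x1 away from its true value by the omitted coefficient times the slope of x2 on x1.

```python
import numpy as np

# Assumed data-generating process: y = 1 + 2*x1 + 3*x2 + e,
# where x2 = 0.5*x1 + u is correlated with x1.
rng = np.random.default_rng(0)
n = 100_000
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)          # x2 correlated with x1
y = 1 + 2 * x1 + 3 * x2 + rng.normal(size=n)

def ols(X, y):
    """Least-squares coefficients of y on X, with an intercept column."""
    X = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

b_full = ols(np.column_stack([x1, x2]), y)  # correctly specified model
b_short = ols(x1.reshape(-1, 1), y)         # x2 omitted

print(b_full[1])   # near the true slope, 2
print(b_short[1])  # near 2 + 3*0.5 = 3.5: biased upward
```

In large samples the short regression's slope converges to the true coefficient plus the omitted coefficient times the auxiliary slope of x2 on x1 (here 2 + 3 × 0.5 = 3.5), matching the chapter's claim that the estimator is no longer unbiased for the population coefficient. The special case of no bias occurs when that auxiliary slope is zero, i.e. the omitted variable is uncorrelated with the included one.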