ABSTRACT

In this chapter, the authors begin with simple linear regression and then move to multiple linear regression, the difference being the number of explanatory variables allowed. For historical and context-specific reasons, the literature uses a variety of terms to describe the same idea. The most common way to summarize the range of uncertainty for a coefficient is to transform its standard error into a confidence interval. Several threats to the validity of linear regression estimates deserve attention, particularly when using an unfamiliar dataset. The most important of these, and hence the one addressed at some length, is whether the model is directly relevant to the research question of interest. Breiman describes two cultures of statistical modeling: one focused on inference and the other on prediction.