ABSTRACT
The central theme of this chapter is modeling associations among variables. Understanding these associations can be important for many reasons, including:
Reason 1. Prediction of future observations
Reason 2. Variable screening
Reason 3. System explanation
Reason 4. Parameter estimation
The primary tool used in this chapter to model associations among variables is regression. Regression analysis models the relationship between a single variable Y, called the response or dependent variable, and one or more explanatory variables x_1, x_2, ..., x_{p-1}, also called predictors or independent variables. The response variable must be continuous, but the predictor variables can be continuous, discrete, or categorical. The word “regression” is due to Sir Francis Galton, who demonstrated that offspring do not tend toward the size of their parents; rather, offspring size tends toward the mean of the population. That is, there is a “regression toward mediocrity.” The following examples illustrate scenarios where it is important to understand the associations among response and predictor variables.