ABSTRACT

Our focus in this chapter has been the application of regularization and variable selection methods to the problem of multiple linear regression. As we noted at the start, regression is an extremely powerful and commonly used statistical tool employed by researchers across many different disciplines; regularization methods for linear models therefore have application in many fields. We began our discussion with stepwise and best subsets regression, which are non-regularized approaches for identifying only the most salient independent variables to include in a regression model. We then turned our attention to the application of the lasso, ridge, and elastic net estimators in R, and saw how to determine whether regularization would be helpful and, if so, how to select the optimal value of the tuning parameter. We then fit these estimators to the data and obtained inference for the resulting parameter estimates. Next, we described a Bayesian approach to the problem of regularization and saw how seamlessly estimation and inference fit together with this approach compared to the frequentist methods. We then discussed the adaptive and grouped lasso and saw how relatively easy they are to use in R. In particular, we saw that the negative bias associated with the regularization methods is mitigated by the adaptive lasso (alasso), making it an attractive alternative. We finished the chapter with a demonstration of how to compare the fits from the many methods examined here in order to select the one that may be optimal for a given problem.
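To make this workflow concrete, a minimal sketch in R might look like the following. This is our illustration rather than the chapter's own code; the use of the glmnet package and of simulated data are assumptions on our part.

library(glmnet)

# Simulated data for illustration; the chapter's own data are not
# reproduced here
set.seed(123)
n <- 100; p <- 20
x <- matrix(rnorm(n * p), n, p)
y <- drop(x[, 1:3] %*% c(2, -1.5, 1) + rnorm(n))

# alpha = 1 gives the lasso, alpha = 0 the ridge, and intermediate
# values the elastic net
cv_lasso <- cv.glmnet(x, y, alpha = 1)

# Tuning parameter value minimizing the cross-validated error
cv_lasso$lambda.min

# Coefficient estimates at the selected tuning parameter
coef(cv_lasso, s = "lambda.min")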

We will close this chapter by briefly discussing the issue of when to use which method. The short answer is that there is no short answer. Much recent work has compared the approaches with one another (see Statistical Science, 35(4), November 2020). Taken together, the results of these and other studies suggest that no single approach will be optimal in every situation. Rather, we should try several methods and compare the results to one another, as described at the end of this chapter. The simulation studies in Statistical Science suggest that in many situations stepwise and best subsets regression perform quite similarly. It is also important to note that whereas the regularization methods yield coefficient estimates that vary continuously with the data and the tuning parameter, stepwise and best subsets make discrete, all-or-nothing decisions about variable inclusion. Moreover, the simulations demonstrated that the variance associated with stepwise and best subsets can be quite large compared with that of the regularization methods. Thus, rather than thinking in terms of a single optimal approach, we should think about the best method for a given research situation.
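As one hedged illustration of such a comparison, again assuming the glmnet package and simulated data rather than reproducing the chapter's own code, several methods can be tuned on a training set and then compared on held-out data:

library(glmnet)

set.seed(456)
n <- 100; p <- 20
x <- matrix(rnorm(n * p), n, p)
y <- drop(x[, 1:3] %*% c(2, -1.5, 1) + rnorm(n))

# Hold out a test set so the methods are compared on data not used
# for fitting or tuning
train <- sample(n, 70)

# Tune each estimator by cross-validation on the training data
fits <- lapply(c(lasso = 1, enet = 0.5, ridge = 0),
               function(a) cv.glmnet(x[train, ], y[train], alpha = a))

# Test-set mean squared error at each method's selected lambda
sapply(fits, function(f) {
  pred <- predict(f, newx = x[-train, ], s = "lambda.min")
  mean((y[-train] - drop(pred))^2)
})

The method with the smallest test-set error would be a reasonable choice for that particular problem, though the ranking can change with a different sample, which is precisely why we recommend comparing several methods rather than committing to one in advance.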