ABSTRACT

In this chapter, I discuss the lasso and sparsity, in the area of supervised learning that has been the focus of my research and that of many other statisticians. This area can be described as follows. Many statistical problems involve modeling important variables or outcomes as functions of predictor variables. One objective is to produce models that allow predictions of outcomes that are as accurate as possible. Another is to develop an understanding of which variables in a set of potential predictors are strongly related to the outcome variable. For example, the outcome of interest might be the price of a company’s stock in a week’s time, and potential predictor variables might include information about the company, the sector of the economy it operates in, recent fluctuations in the stock’s price, and other economic factors. With technological development and the advent of huge amounts of data, we are frequently faced with very large numbers of potential predictor variables.