ABSTRACT

This chapter presents a comprehensive discussion of outliers. First, outliers are defined, their causes are explained, and their effects on regression estimates are detailed. Next, various methods for identifying outliers are presented, including the “leverage” statistic, standardized residuals, and Cook’s distance. Then, strategies for dealing with outliers are given, including maximum likelihood with heavy-tailed distributions, quantile regression, and Winsorization. Winsorization is discussed in great detail, leading to the conclusion that it should not be used. Supporting examples, simulations, and R code are provided throughout.