Chapter

Handling Missing Data
ABSTRACT
This chapter focuses on methods for resolving missing values in the predictors. The goal of feature engineering is to get the predictors into a form that models can use more effectively when relating the predictors to the response. Missing values in the original predictors, regardless of any feature engineering, are intolerable in many kinds of predictive models. One framework for viewing missing values is through the mechanisms that produce them. Three common mechanisms are structural deficiencies in the data, random occurrences, and specific causes. A co-occurrence plot, which displays how frequently each combination of predictors is missing together, can further deepen the understanding of missing information. Simple numerical summaries are effective at identifying problematic predictors and samples when the data become too large to inspect visually. Many popular predictive models, such as support vector machines, the glmnet, and neural networks, cannot tolerate any missing values.
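
The numerical summaries and the counts behind a co-occurrence plot can be illustrated with a minimal sketch. It is not taken from the chapter: the toy data, the predictor names (`age`, `income`, `height`), and the helper function names are all hypothetical, and `None` is assumed to mark a missing value. The sketch uses only the Python standard library.

```python
from collections import Counter

# Hypothetical toy data (not from the chapter); None marks a missing value.
rows = [
    {"age": 34,   "income": 52000, "height": 1.70},
    {"age": None, "income": 48000, "height": 1.65},
    {"age": 29,   "income": None,  "height": None},
    {"age": None, "income": None,  "height": 1.80},
    {"age": 41,   "income": 61000, "height": 1.75},
    {"age": 38,   "income": None,  "height": None},
]
predictors = list(rows[0])


def missing_rate_by_predictor(rows, predictors):
    """Fraction of samples in which each predictor is missing."""
    n = len(rows)
    return {p: sum(r[p] is None for r in rows) / n for p in predictors}


def missing_count_by_sample(rows, predictors):
    """Number of missing predictors in each sample."""
    return [sum(r[p] is None for p in predictors) for r in rows]


def missing_combinations(rows, predictors):
    """Frequency of each combination of predictors that are missing together.

    These counts are the raw material a co-occurrence plot would display.
    """
    patterns = Counter()
    for r in rows:
        pattern = tuple(p for p in predictors if r[p] is None)
        if pattern:  # skip fully observed samples
            patterns[pattern] += 1
    return patterns


rates = missing_rate_by_predictor(rows, predictors)
per_sample = missing_count_by_sample(rows, predictors)
combos = missing_combinations(rows, predictors)

for name, rate in rates.items():
    print(f"{name}: {rate:.0%} missing")
for pattern, count in combos.most_common():
    print(" + ".join(pattern), "->", count)
```

In this toy data, `income` and `height` are missing together in two samples, the kind of pattern that might point to a shared cause rather than a random occurrence. Predictors or samples with a high missing rate would be candidates for closer inspection before modeling.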