Chapter 4
28 Pages

Exploratory Visualizations

By Max Kuhn, Kjell Johnson

This chapter presents approaches for visually exploring data and demonstrates how these approaches can help guide feature engineering. When the ultimate purpose is to predict a response, one of the first steps of the exploratory data process is to create visualizations that elucidate the response and then uncover relationships between the predictors and the response. The chapter explores relationships among the predictors and between the predictors and the response, and it describes a variety of useful visualization tools for exploring data prior to constructing the initial model. Univariate visualizations are used to understand the distribution of a single variable; common examples are box-and-whisker plots, violin plots, and histograms. A variable that is collected over time presents unique challenges in the modeling process. To illustrate different visualization techniques for qualitative data, the OkCupid data are used.
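As a rough illustration of what the univariate visualizations mentioned above summarize, the sketch below computes the five-number summary that a box-and-whisker plot draws and the bin counts that a histogram displays. It is not the authors' code: the function names `five_number_summary` and `histogram_counts`, the inclusive quartile method, the equal-width binning, and the sample values are all assumptions made for illustration, and it uses only the Python standard library.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the statistics a box-and-whisker plot draws.

    Quartiles use the "inclusive" method; plotting libraries may use other conventions.
    """
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


def histogram_counts(values, n_bins):
    """Count observations in n_bins equal-width bins spanning the data range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so everything lands in the first bin.
        return [len(values)] + [0] * (n_bins - 1)
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin rather than past it.
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical data for illustration only; not taken from the chapter.
sample = [2, 3, 3, 4, 5, 5, 5, 6, 7, 9, 14]
summary = five_number_summary(sample)
counts = histogram_counts(sample, n_bins=4)
print(summary)
print(counts)
```

In this made-up sample, the single large value stretches the maximum well past the third quartile and leaves the upper bins nearly empty, which is the kind of right skew a box plot or histogram would make visible at a glance.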