ABSTRACT

This chapter aims to facilitate the transfer of tools and frameworks between industry and academia, among software engineering, statistics, and computer science, and across different domains. Exploratory data analysis can give learners a sense of their data, help identify issues with the data, bring outliers to light, and inform model construction. Kaggle.com is a machine learning and predictive modeling competition website that hosts datasets uploaded by companies, governmental organizations, and individuals. The US_births_1994_2003 data frame included in the fivethirtyeight package provides information about the number of daily births in the United States between 1994 and 2003. Let’s continue our exploratory data analysis by creating multivariate visualizations.