ABSTRACT

The ultimate objective in data science is to extract meaning from data. Data wrangling and visualization are tools to this end. Wrangling re-organizes cases and variables to make data easier to interpret. Visualization is a primary tool for connecting our minds with the data, so that we humans can search for meaning. Visualizations are powerful because human visual cognitive skills are strong. But that same strength is a liability: people can easily be misled by the accidental, evanescent patterns that appear in random noise. Statistical methodology takes a broader point of view, one that helps guard against being misled in this way. This chapter introduces key ideas from statistics that permeate data science. It elucidates some of the connections between the sample (the data we've got) and the population from which it was drawn.