ABSTRACT

A key first step in any data analysis is getting to know the data you are working with. This chapter introduces descriptive statistics, which are used to explore, describe, and summarize data. Descriptive analysis should come first: it identifies data cleaning needs, characterizes each variable and its trends, reveals associations between and among variables, and shows where a more detailed look at variables and associations is necessary. Hand calculations of the statistics are explained in detail, and statistical software programs (SAS® and Stata®) are used to demonstrate the calculation of descriptive statistics and the production of graphs. The chapter discusses key descriptive statistics, including measures of central tendency (arithmetic mean, median, mode, geometric mean), measures of spread and variability (range, interquartile range, percentiles, variance, standard deviation), graphic methods (bar charts, histograms, box plots, stem-and-leaf plots, scatter plots, choropleth GIS maps), detection of outliers in the data, and standard distributional rules (empirical rule, Chebyshev’s inequality).