Graphical displays of data and descriptive statistics | 3 | v2

ABSTRACT

Examples of discrete, continuous, categorical, and ordinal variables are given, and the distinction between a sample and a population is made. A variety of graphical methods for displaying data are shown, including histograms scaled to unit area and cumulative frequency polygons for continuous data. Numerical summaries of data are given, including the mean and median for location and the standard deviation and inter-quartile range for spread. Box-plots and their potential for highlighting outlying values are introduced. The calculation of the mean and standard deviation from grouped data is also covered and is related to the concept of expectation using probability density functions, which are defined in Chapter 5. Another interpretation of the mean, as the x-coordinate of the center of gravity of the histogram, is discussed. The shapes of distributions, including the measures of skewness and kurtosis, are introduced through examples. The main text concludes with examples of displays for bivariate and multivariate data: the scatter plot, the three-dimensional histogram, and the parallel coordinate plot. Software functions for graphics are used throughout the chapter, and statistics are calculated using both their definitions as formulae in terms of the data and inbuilt software functions. The experiment E3 Robot Rabbit is relevant to the material in this chapter and also introduces the concept of sampling distributions. The chapter ends with a summary of the notation used, the main results, and MATLAB and R syntax, followed by exercises.