Describing Categorical Data: Frequency Distributions, Graphics, and Summary Statistics
In Chapters 2 and 3 we presented techniques for aggregating and describing sets of quantitative measurements, focusing on the properties of centrality, dispersion, and shape. In this chapter we present techniques for summarizing nominal measurements, where the numerals have no numeric meaning and are simply labels for the categories into which our observations may be classified. For example, in the two previous chapters we worked with a data set of 200 children entering kindergarten, 100 of whom were White and 100 of whom were African American. These data were randomly sampled from the larger data file described in the codebook on the book website. The larger file, with 2,577 cases, represents several other racial/ethnic groups as well. If you look at the codebook entry for the variable “race,” you will note that a value of one (1) refers to “White, Non-Hispanic,” a value of two (2) refers to “Black or African American, Non-Hispanic,” a value of three (3) refers to “Hispanic, Race Specified,” and so on. Looking at the column of values for “race” in the data file, you will see integer values 1 through 8, with –9 for missing values. The integer simply identifies the racial/ethnic group of that particular case; it carries no information about order or magnitude.
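To make the idea of numerals as labels concrete, here is a minimal sketch in Python of how the “race” codes might be tallied. It is an illustration under stated assumptions, not the procedure used in this book: the function name frequency_table and the sample values are hypothetical, and only the three labels quoted above are filled in (the labels for codes 4 through 8 are in the codebook and are left as bare codes here).

```python
from collections import Counter

# Labels quoted in the text for codes 1-3. Codes 4-8 are defined in the
# codebook and are deliberately left unlabeled in this sketch.
RACE_LABELS = {
    1: "White, Non-Hispanic",
    2: "Black or African American, Non-Hispanic",
    3: "Hispanic, Race Specified",
}
MISSING = -9  # missing-value code described in the text


def frequency_table(codes):
    """Count cases per category, reporting missing values separately."""
    valid = [c for c in codes if c != MISSING]
    counts = Counter(valid)
    n_missing = len(codes) - len(valid)
    # Fall back to "code N" for any code without a label in RACE_LABELS.
    table = {
        RACE_LABELS.get(code, f"code {code}"): n
        for code, n in sorted(counts.items())
    }
    return table, n_missing


# Made-up values for illustration only; not taken from the book's data file.
sample = [1, 2, 2, 3, 1, -9, 2, 1, 3, -9]
table, n_missing = frequency_table(sample)
for label, n in table.items():
    print(f"{label}: {n}")
print(f"Missing: {n_missing}")
```

Note that the sketch only counts how often each code occurs. Computing a mean of the codes would run without error, but the result would be meaningless, because the integers are labels rather than quantities.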