ABSTRACT

One of the most famous categorical data sets in biology comes from the plant breeding experiment of the Austrian monk and botanist G. Mendel (1822-1884), who classified peas according to their shape (round (r) or wrinkled (w)) and color (yellow (y) or green (g)). Each seed was classified into one of four categories: (ry) = (round, yellow), (rg) = (round, green), (wy) = (wrinkled, yellow), and (wg) = (wrinkled, green). According to Mendel’s theory, the frequency counts of the four seed types produced by this experiment should occur in the ratios 9 : 3 : 3 : 1. Thus, the ratio of the number of (ry) peas to the number of (rg) peas should equal 9 : 3, and similarly for the other pairs of categories. Mendel’s experiment raises the following fundamental scientific question: Are Mendel’s data, displayed in Table 9.1, consistent with the predictions of his scientific model? To answer this question, and more complex ones arising in the study of categorical data, statisticians use the chi-square test.
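
As a rough illustration of the comparison described above, the following Python sketch converts the 9 : 3 : 3 : 1 ratios into expected counts and computes Pearson's chi-square statistic, the sum over categories of (observed - expected)^2 / expected. The function names and the sample counts are hypothetical and chosen only for illustration; they are not the data of Table 9.1. The critical value 7.815 is the upper 5% point of the chi-square distribution with 3 degrees of freedom (four categories minus one).

```python
from fractions import Fraction

# Ratios predicted by Mendel's theory for the four seed categories.
RATIOS = {"ry": 9, "rg": 3, "wy": 3, "wg": 1}


def expected_counts(n, ratios=RATIOS):
    """Expected count in each category for n seeds under the given ratios."""
    total = sum(ratios.values())  # 9 + 3 + 3 + 1 = 16
    return {k: Fraction(n * r, total) for k, r in ratios.items()}


def chi_square_statistic(observed, ratios=RATIOS):
    """Pearson's statistic: sum of (observed - expected)^2 / expected."""
    n = sum(observed.values())
    expected = expected_counts(n, ratios)
    return float(sum((observed[k] - expected[k]) ** 2 / expected[k]
                     for k in ratios))


# Hypothetical counts for 160 seeds (illustration only, not Mendel's data).
hypothetical_counts = {"ry": 95, "rg": 28, "wy": 27, "wg": 10}

stat = chi_square_statistic(hypothetical_counts)

# Upper 5% point of the chi-square distribution with 3 degrees of freedom.
CRITICAL_5PCT_DF3 = 7.815

print(f"chi-square statistic = {stat:.4f}")
print("consistent with 9:3:3:1 at the 5% level:", stat < CRITICAL_5PCT_DF3)
```

With 160 seeds the expected counts are 90, 30, 30, and 10, so small deviations from them give a statistic well below the critical value, and the hypothetical sample would be judged consistent with the model.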