Hypothesis testing with categorical data | 15

ABSTRACT

This chapter extends two-sample hypothesis testing from means (Chapter 8) and medians (Chapter 9) for continuous data to the categorical data setting. It covers two-sample tests for proportions (using normal theory, contingency tables, and odds ratios), the Pearson chi-squared test for independent data, and the McNemar test for paired data. Readers are introduced to the chi-squared distribution and to the use of the chi-squared statistical table (Appendix Table A.5) to determine critical values and p-values for hypothesis testing. The relationship between the chi-squared distribution and the normal distribution is discussed in relation to the different approaches that can be used to test a hypothesis with categorical data. The chapter also includes hand-calculation procedures for applying continuity corrections when a discrete distribution is approximated by a continuous distribution. Categorical hypothesis testing methods are discussed in an epidemiologic framework that uses contingency tables (2 × 2 and r × c) and odds ratios. Sample size and power calculations for two-sample tests of proportions are discussed briefly, with accompanying statistical software code (SAS and Stata).