ABSTRACT

Categorical data take the form of counts of individuals falling into two or more discrete classifications. Data of this type are common in medical studies and are presented in contingency tables that show the number of persons belonging to each classification. Seventy percent of the articles in our survey of the New England Journal of Medicine used contingency tables to describe characteristics of the patients under study. Consumers and producers of medical research reports should therefore understand the concepts and issues that arise in interpreting contingency tables. We present basic concepts and analyses and describe the scope of scientific questions that contingency analysis can address. Small 2 × 2 tables of counts are easily analyzed using either Fisher’s exact test or Pearson’s chi-square test. Tables with three or more dimensions are common in original data but seldom appear in the Journal. Injudiciously collapsing tables can create or eliminate interaction between factors. Simpson’s paradox may occur if proper randomization is not performed in the allocation of subjects to all possible treatments. Logistic regression (including multiple logistic regression) is a powerful tool in analysis and has become common in the Journal.
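
As an illustration of the two tests named above for a small 2 × 2 table, the following is a minimal sketch in plain Python. The function names and the example counts are hypothetical and are not taken from the survey. The chi-square statistic is computed without a continuity correction, and the two-sided Fisher p-value uses one common convention: it sums the probabilities of all tables with the observed margins that are no more probable than the observed table.

```python
from math import comb


def pearson_chi_square(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction)
    for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2 x 2 table [[a, b], [c, d]].

    With the margins held fixed, the top-left count follows a
    hypergeometric distribution. The p-value sums the probabilities of
    all tables that are no more probable than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given the margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # The small tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: rows are two treatments, columns are
# improved / not improved.
chi2 = pearson_chi_square(3, 1, 1, 3)         # 2.0
p_exact = fisher_exact_two_sided(3, 1, 1, 3)  # 34/70, about 0.486
```

With counts this small the expected cell frequencies are all 2, which is the situation in which the exact test is usually preferred over the chi-square approximation.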