ABSTRACT

The aim of many analyses of social science and medical data sets is to draw causal inferences about the relative effects of treatments, such as different methods of treating psychologically disturbed patients or cancer patients. The data available to compare many such treatments are not based on the results of carefully conducted randomized clinical trials, but rather are collected while observing systems as they operate in “normal” practice, without any interventions implemented by randomized assignment rules. Such data are relatively inexpensive to obtain, however, and often do represent the spectrum of actual practice better than do the settings of randomized experiments. Consequently, it is sensible to try to estimate the effects of treatments from such data sets, even if only to help design a new randomized experiment or shed light on the generalizability of results from existing randomized experiments. Standard methods of analysis using routine statistical software (e.g., linear or logistic regressions), however, can be quite deceptive for these objectives because they provide no warnings about their propriety. Propensity score methods, introduced by Rosenbaum and Rubin (1983a), are more reliable tools for addressing such objectives because the assumptions needed to make their answers appropriate are more assessable and transparent to the investigator. Subclassification on propensity scores is a particularly straightforward technique and is the topic of this chapter. Because these techniques are so straightforward, they seem especially appropriate to review in this Festschrift for Ralph Rosnow, who has made such substantial contributions to the teaching and dissemination of straightforward statistical methods in psychological research. It has been a true pleasure to work with Ralph as a friend and collaborator (e.g., Rosenthal, Rosnow, & Rubin, 2002).