ABSTRACT

An accumulating literature (Bakan, 1966; Carver, 1978; Cohen, 1994; Gigerenzer, 1993; Guttman, 1977, 1985; Meehl, 1967, 1978; Oakes, 1986; Pollard, 1993; Rozeboom, 1960; Serlin and Lapsley, 1993; Schmidt, 1992, 1996) has called for a critical reexamination of the common use of “null hypothesis significance testing” (NHST) in psychological and social science research. Most of these articles expose misconceptions about significance testing common among researchers and writers of psychological textbooks on statistics and measurement. But the criticisms do not stop with misconceptions about significance testing. Others, like Meehl (1967), expose the limitations of a statistical practice that focuses only on testing for zero differences between means and zero correlations instead of testing predictions about specific nonzero values for parameters derived from theory or prior experience, as is done in the physical sciences. Still others emphasize that significance tests do not alone convey the information needed to properly evaluate research findings and perform accumulative research. For example, reporting that results are significant at some prespecified significance level (as in the early Fisher, 1935, or Neyman-Pearson, 1933, significance testing paradigms) or reporting the p level of significance (late Fisher, 1955, 1959) does not indicate the effect size (Glass, 1976; Hays, 1963; Hedges, 1981), the power of the test (Cohen, 1969, 1977, 1988), or the crucial parameter estimates that other researchers may use in meta-analytic studies (Rosenthal, 1993; Schmidt, 1996). A common recommendation in these critiques is to report confidence interval estimates of the parameters and effect sizes. Such estimates provide data usable in meta-analyses. The confidence interval also provides a rough and easily computed index of power, with narrow intervals indicative of high power and wide intervals of low power (Cohen, 1994). A confidence interval corresponding to a commonly accepted level of significance (e.g., .05) would also provide the information needed to perform a significance test of prespecified parameter values.
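
The link between a confidence interval and a significance test of a prespecified value can be illustrated with a minimal sketch. It is not drawn from the works cited above: the data are made up, the function names `z_confidence_interval` and `rejects` are hypothetical, and the interval uses a large-sample normal approximation for a single mean (a t-based interval would normally be preferred for a sample this small). A prespecified value is rejected at level alpha when it falls outside the (1 - alpha) interval, and the interval's width gives a rough sense of precision.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def z_confidence_interval(sample, alpha=0.05):
    """Normal-approximation (1 - alpha) confidence interval for a mean.

    Assumption: the sample is large enough for the normal approximation;
    it is used here only to keep the sketch dependency-free.
    """
    n = len(sample)
    center = mean(sample)
    std_error = stdev(sample) / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return center - z_crit * std_error, center + z_crit * std_error


def rejects(interval, hypothesized_value):
    """Two-sided test at the interval's level: reject the prespecified
    value if and only if it lies outside the interval."""
    lower, upper = interval
    return not (lower <= hypothesized_value <= upper)


# Hypothetical scores, invented purely for illustration.
scores = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 5.1, 4.9, 5.2]

ci_95 = z_confidence_interval(scores, alpha=0.05)
print("95% interval:", ci_95)
print("reject mean = 0.0?", rejects(ci_95, 0.0))  # the usual nil hypothesis
print("reject mean = 5.0?", rejects(ci_95, 5.0))  # a specific nonzero value
```

The same interval serves both purposes: it reports an estimate with its precision, and it tests any prespecified value, zero or nonzero, without a separate computation.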