ABSTRACT

In the previous chapter we described a procedure, the t-test, for testing whether a difference between two means is significant. The next logical question is, What do we do when we have three or more means? One apparent answer is to repeat the t-test, taking two means at a time until all possible comparisons have been made. Thus, if we have five groups of subjects, we would conduct ten separate t-tests to find out which pairs of means are significantly different. Running that many tests is tedious, but there are two more important reasons why this procedure is generally rejected by statisticians. (1) If we run a large number of tests of significance, we can expect, purely by chance, that a certain proportion of these tests, a proportion roughly equal to alpha, will come out significant even when the null hypothesis is true in every case. For example, if we conduct one hundred t-tests on data in which no real differences exist, we would expect on average about five of these tests to be statistically significant at the .05 level and about one at the .01 level. Multiple tests, therefore, increase the likelihood of incorrectly rejecting a true null hypothesis. (2) In more sophisticated research designs where there are two or more independent variables, we must acknowledge the fact that these variables can and do interact with one another. Because the t-test compares only two means at a time and does not "partial out" these interaction effects, its results will be biased by them.
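
The inflation of the error rate described in point (1) can be seen in a small simulation. The sketch below, which assumes NumPy and SciPy are available and uses arbitrary group sizes, means, and a .05 alpha chosen only for illustration, draws five groups from the same population, runs all ten pairwise t-tests, and estimates how often at least one comparison comes out "significant" even though every null hypothesis is true.

```python
# Minimal sketch: why repeated pairwise t-tests inflate the chance of a
# false rejection. All five groups are drawn from the SAME population,
# so every null hypothesis is true; any "significant" result is an error.
# Group sizes, population parameters, and alpha are illustrative choices.
import itertools
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 0.05
n_groups, n_per_group, n_experiments = 5, 20, 2000

false_alarm_count = 0
for _ in range(n_experiments):
    groups = [rng.normal(loc=50, scale=10, size=n_per_group)
              for _ in range(n_groups)]
    # All C(5, 2) = 10 pairwise t-tests for this simulated experiment
    p_values = [stats.ttest_ind(a, b).pvalue
                for a, b in itertools.combinations(groups, 2)]
    if min(p_values) < alpha:   # at least one spurious "difference"
        false_alarm_count += 1

print(f"Estimated familywise error rate: "
      f"{false_alarm_count / n_experiments:.2f} "
      f"(each individual test uses alpha = {alpha})")
```

Under these assumptions the estimated familywise error rate typically comes out well above the nominal .05 of any single test, which is the point of objection (1): the more comparisons we make, the more likely we are to reject at least one true null hypothesis by chance alone.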