ABSTRACT

Analysis of variance (ANOVA) tests are used to compare three or more samples in terms of some numerical dependent variable. An ANOVA test determines whether or not there are any significant differences between the samples. An immediate question is why a series of t-tests cannot be used to compare the samples instead. After all, if there are differences between the samples, then we will wish to compare each pair of samples anyway. Indeed, in this age of computing power, a series of t-tests can be conducted relatively quickly even when many samples are being compared.
There are two main reasons for using a single ANOVA test to compare three or more samples. The first reason is that if there are no significant differences between the samples, then this one ANOVA test will determine this and the result can be expressed concisely. The second reason is to avoid inflating the probability of making a Type I Error when comparing multiple pairs of samples using t-tests. Imagine we were comparing six different socio-economic groups with respect to some numerical variable. There are 15 different pairs of groups here. If the probability of making a Type I Error is 0.05 each time we compare a pair of socio-economic groups (one chance in 20 of concluding there is a difference when none exists), and the comparisons are treated as independent, then our chance of avoiding a Type I Error in all 15 comparisons decreases to 0.95^15, which is 0.463. Therefore, the chance of making a Type I Error somewhere in our series of 15 t-tests has inflated to 0.537.
Another way of thinking about multiple pairwise comparisons is a person trying to cross a river using 20 stepping stones between the two banks. Each time the person steps on a stone, there is one chance in 20 that the person will fall in the river. With 20 stepping stones to be used, the expected number of falls is one, and the person is more likely than not to fall in at least once when they try to cross (the chance is 1 - 0.95^20, which is about 0.64).
An ANOVA test therefore provides a means of limiting the chance of making a Type I Error, because it is a single test that determines whether or not there are any significant differences between the samples.
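The arithmetic behind the inflated Type I Error rate can be checked with a few lines of Python. This is a minimal sketch rather than part of the original text: the function names pairwise_comparisons and familywise_error_rate are illustrative, and the calculation assumes, as the example above does, that each comparison is independent and has the same 0.05 chance of a Type I Error.

```python
from math import comb


def pairwise_comparisons(n_groups: int) -> int:
    """Number of distinct pairs that can be formed from n_groups samples."""
    return comb(n_groups, 2)


def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one Type I Error in n_tests tests.

    Assumes the tests are independent and each has Type I Error rate alpha.
    """
    return 1 - (1 - alpha) ** n_tests


# Six socio-economic groups give 15 pairwise t-tests.
n_pairs = pairwise_comparisons(6)
print(n_pairs)                                    # 15
print(round(familywise_error_rate(n_pairs), 3))   # 0.537

# Stepping-stone analogy: 20 stones, one chance in 20 of falling at each.
print(round(familywise_error_rate(20), 3))        # 0.642
```

In practice, pairwise t-tests that share groups are not strictly independent, so these figures are best read as an approximation of how quickly the error rate grows with the number of comparisons.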