ABSTRACT

In Chapter 4 we compared the means (or medians) of two samples using a t test (for normally distributed data) or a Mann-Whitney U test (for ordinal or non-normal data). However, many experiments or surveys involve the comparison of three or more samples. For example, in a survey of household energy bills, we might place houses in three categories depending on the type of insulation they have: loft insulation, double-glazing or minimal insulation. To investigate the effect of insulation type on household energy bill, it might seem logical to perform three t tests (comparing 'loft' with 'double-glazing', 'loft' with 'minimal' and 'double-glazing' with 'minimal'). This would certainly be possible. However, if we had more samples, the computation would become very long-winded and time-consuming (for seven samples, for example, we would need 21 separate tests). There is a more serious objection to conducting multiple t tests: when we compute several tests we increase our chances of obtaining a significant result by chance alone. Remember that the critical probability value is 0.05. The null hypothesis is rejected at this value because, if it were true, a result at least this extreme would occur on only 5% of occasions. We can turn this around and say that on 5% of the occasions when the null hypothesis is true, it is nevertheless rejected (this is a type I error: see Chapter 3). It now becomes clear that if we do 20 tests, we should expect about one of them to appear significant even if there were no real differences between the means; if the tests were independent, the probability of at least one such false positive would be about 64% (1 - 0.95^20).
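
The arithmetic behind these figures can be checked with a short Python sketch. It counts the pairwise comparisons needed for k samples, k(k - 1)/2, and estimates the chance of at least one type I error across m tests as 1 - (1 - 0.05)^m. The function names are illustrative only, and the second formula assumes the tests are independent, which pairwise t tests on shared samples are not, so the result is a rough guide and not an exact value.

```python
from math import comb


def pairwise_tests(k: int) -> int:
    """Number of distinct pairs among k samples, i.e. k(k - 1)/2."""
    return comb(k, 2)


def familywise_error_rate(m: int, alpha: float = 0.05) -> float:
    """Chance of at least one type I error in m tests at level alpha.

    Assumes the m tests are independent (an approximation here).
    """
    return 1 - (1 - alpha) ** m


if __name__ == "__main__":
    for k in (3, 7):
        m = pairwise_tests(k)
        rate = familywise_error_rate(m)
        print(f"{k} samples: {m} tests, error rate about {rate:.2f}")
```

Under these assumptions, the three insulation categories already push the chance of a spurious significant result well above the nominal 5%, and it grows quickly as more samples are added.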