## Analysis of Variance II: School Attendance among Australian Children

The data in Table 5.1 are unbalanced (the different combinations of factors have different numbers of observations), and this presents considerably more problems than were encountered in the analysis of the balanced factorial data in the previous chapter. The main difficulty is that when the data are unbalanced, there is no unique way of finding a “sum of squares” corresponding to each main effect and each interaction, because these effects are no longer independent of one another. It is no longer possible to partition the total variation in the response variable into nonoverlapping, or orthogonal, sums of squares representing factor main effects and factor interactions. For example, there is a proportion of the variance of the response variable that can be attributed to (explained by) either sex or age group; consequently, sex and age group together explain less of the variation of the response than the sum of what each explains alone. As a result, the sum of squares that can be attributed to a factor depends on which factors have already been allocated a sum of squares; in other words, the sums of squares of factors and their interactions depend on the order in which they are considered.
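
The order dependence described above can be seen in a small numerical sketch. The example below does not use the data of Table 5.1; it assumes a hypothetical toy data set with two binary factors, labelled `A` and `B`, and deliberately unequal cell counts. The helper functions `rss` and `sequential_ss` are illustrative names introduced here, and only main effects are fitted (no interaction) to keep the illustration short. Each sequential sum of squares is computed as the drop in residual sum of squares when a factor is added to the factors already in the model.

```python
import numpy as np


def rss(y, columns):
    """Residual sum of squares from a least-squares fit of y on an
    intercept plus the supplied predictor columns."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def sequential_ss(y, factors, order):
    """Sequential sums of squares: each factor is credited with the
    reduction in residual sum of squares it achieves when added to
    the factors entered before it."""
    ss = {}
    entered = []
    previous = rss(y, entered)  # intercept-only model: total sum of squares
    for name in order:
        entered.append(factors[name])
        current = rss(y, entered)
        ss[name] = previous - current
        previous = current
    return ss


# Hypothetical unbalanced layout (an assumption for illustration only):
# cell (A=0, B=0) has 4 observations, (0, 1) has 1, (1, 0) has 1, (1, 1) has 4.
a = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], dtype=float)
b = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1], dtype=float)
y = np.array([1, 2, 3, 2, 4, 5, 6, 7, 8, 7], dtype=float)
factors = {"A": a, "B": b}

ss_a_first = sequential_ss(y, factors, ["A", "B"])  # A entered first, then B
ss_b_first = sequential_ss(y, factors, ["B", "A"])  # B entered first, then A

print("A then B:", ss_a_first)
print("B then A:", ss_b_first)
```

In this toy layout the two factors are correlated, so the sum of squares credited to `A` is much larger when `A` is entered first than when it is entered after `B`, while the total explained by both factors together is the same for either order. The two "entered first" values, `ss_a_first["A"]` and `ss_b_first["B"]`, add up to more than that joint total, which is the overlap described in the text.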