ABSTRACT

In Chapter 1, we considered the standard linear model that underlies such common statistical methods as regression and analysis of variance (ANOVA; the general linear model). As noted, this model rests on several primary assumptions about the nature of the data in a population. Of particular importance in the context of multilevel modeling is the assumption of independently distributed error terms for the individual observations within a sample. This assumption essentially means that, once the independent variables in the analysis are accounted for, the dependent variable values of the individuals in the sample are unrelated to one another. In the example described in Chapter 1, this assumption was indeed met, as the individuals in the sample were selected randomly from the general population. Therefore, nothing linked their dependent variable values other than the independent variables included in the linear model. However, in many cases the method used for selecting the sample does create correlated responses among individuals. For example, a researcher interested in the impact of a new teaching method on student achievement may randomly select schools for placement in treatment or control groups. If school A is placed into the treatment condition, all students within the school will also be in the treatment condition. This is a cluster randomized design in that the clusters (and not the individuals) are assigned to a specific group. Furthermore, it would be reasonable to assume that the school itself, above and beyond the treatment condition, would have an impact on the performance of its students. This impact would manifest as correlations in achievement test scores among individuals attending the same school.
Thus, if we were to use a simple one-way ANOVA to compare the achievement test means for the treatment and control groups with such cluster-sampled data, we would likely violate the assumption of independent errors because a factor beyond the treatment condition (in this case the school) would exert an additional impact on the outcome variable.
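
To make the consequence of cluster assignment concrete, the following is a minimal simulation sketch, not part of the original analysis. It assumes a balanced design with hypothetical values (200 schools, 20 students per school, a school-level standard deviation of 1, and a student-level standard deviation of 2), and the function names simulate_cluster_trial and residual_icc are illustrative only. After the treatment and control group means are removed, as a one-way ANOVA would do, the residuals of students in the same school remain correlated whenever schools differ from one another.

```python
import random
import statistics


def simulate_cluster_trial(n_schools=200, n_students=20, school_sd=1.0,
                           student_sd=2.0, effect=0.5, seed=1):
    """Simulate a cluster randomized design (all values are hypothetical).

    Whole schools, not individual students, are assigned to a condition.
    Returns one (school_id, treated, score) tuple per student.
    """
    rng = random.Random(seed)
    data = []
    for school in range(n_schools):
        treated = school % 2  # alternate schools between control and treatment
        school_effect = rng.gauss(0.0, school_sd)  # shared by every student in the school
        for _ in range(n_students):
            score = effect * treated + school_effect + rng.gauss(0.0, student_sd)
            data.append((school, treated, score))
    return data


def residual_icc(data):
    """ANOVA-type estimate of the intraclass correlation among residuals.

    Residuals are scores minus the mean of their treatment group, which is
    what a one-way ANOVA on condition leaves unexplained. Assumes equal
    school sizes and equal numbers of schools per condition.
    """
    arm_mean = {}
    for arm in (0, 1):
        arm_mean[arm] = statistics.fmean([s for _, t, s in data if t == arm])

    by_school = {}
    for school, treated, score in data:
        by_school.setdefault(school, []).append(score - arm_mean[treated])

    k = len(by_school)                       # number of schools
    n = len(next(iter(by_school.values())))  # students per school
    school_means = {j: statistics.fmean(r) for j, r in by_school.items()}

    # Between-school mean square within conditions (2 df spent on the arm means)
    msb = n * sum(m ** 2 for m in school_means.values()) / (k - 2)
    # Within-school mean square
    msw = sum((r - school_means[j]) ** 2
              for j, res in by_school.items() for r in res) / (k * (n - 1))
    return (msb - msw) / (msb + (n - 1) * msw)


# Schools differ from one another: errors are correlated within schools.
icc_clustered = residual_icc(simulate_cluster_trial(school_sd=1.0))
# No school-level variation: errors are independent, as the linear model assumes.
icc_independent = residual_icc(simulate_cluster_trial(school_sd=0.0))
print(f"ICC with school effects:    {icc_clustered:.3f}")
print(f"ICC without school effects: {icc_independent:.3f}")
```

Under these assumed values the population intraclass correlation is 1 / (1 + 4) = 0.2 when school effects are present and 0 when they are absent, so the first estimate should fall near 0.2 and the second near 0. A nonzero value indicates that the independence assumption of the one-way ANOVA does not hold for such data.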