ABSTRACT

Though indispensable in most language assessment contexts, classical test theory (CTT) is limited in that it does not account for the effects of multiple sources of variance (e.g., tasks and raters) on test scores. By contrast, generalizability (G) theory, in both its univariate and multivariate forms, estimates in a single analysis not only the relative effect of the ability being measured on the universe-score variance but also the relative effects of specified test method facets on the scores. Because the components of a composite (e.g., the subscales of an analytic rubric in a performance assessment) often represent the operational definition of the ability being measured, relationships between or among the underlying constructs are often hypothesized, and the subscales may or may not function as expected. While univariate G theory can provide information about each subscale separately, multivariate generalizability (MG) theory allows for a simultaneous investigation of the interrelationships among the dimensions represented by the analytic subscales. This chapter illustrates some of the most common and meaningful applications of MG theory in the context of a listening-speaking test. It also outlines the strengths and limitations of MG theory and suggests complementary research methods.