ABSTRACT

This chapter provides a brief overview of generalizability theory (G theory), a framework for modeling the different sources of score variability that contribute to measurement inconsistency. The chapter first introduces basic concepts of G theory in relation to classical test theory (CTT), highlighting a notable strength of G theory: it allows for the analysis of both selected and constructed item responses in both norm-referenced and criterion-referenced assessments. A step-by-step demonstration then illustrates how a dataset can be analyzed using G theory to address specific research questions about measurement consistency. Major considerations in determining rating designs are also discussed, so that a G theory analysis generates meaningful and interpretable results. As an illustrative example, the chapter describes the analysis of a language performance assessment dataset using GENOVA, a software program for univariate G theory analysis. The results of the analysis are discussed along with key considerations in designing and conducting language assessment studies using G theory.