ABSTRACT

Test dimensionality, roughly defined as the minimum number of examinee abilities measured by the test items, is a unifying concept that underlies some of the most central issues in the development and use of large-scale tests. To illustrate, consider four practical concerns that are prominent in the Standards for Educational and Psychological Testing (AERA, APA, NCME, 1999). First, the content-related aspect of test validity requires consistency between the test dimensionality and the target test content structure, suggesting that the usual logical comparison of the test items with the test specification plan should be supplemented with an empirical analysis of test structure. Dimensionality-related tools are also essential in the search for possible construct-irrelevant factors that may threaten the validity of the test. Second, the standard methods of computing score reliability and precision assume that the test is unidimensional, an assumption that is almost always violated to some extent. Thus, the test developer must ask whether the estimated reliability accurately represents the actual reliability despite the violations of unidimensionality. When the estimate is seriously lacking in robustness to such violations, alternative methods of estimating reliability can be sought.