Test Equating | 14 | Statistical Test Theory for the Behavioral Scienc

ABSTRACT

In many testing situations, multiple forms of a test are made available to assess ability, achievement, performance, or some other attribute. When persons are administered several test forms meant to measure the same ability, we want to be able to compare their test scores. With parallel tests this can be done straightforwardly: parallel tests measure the same content and share statistical specifications (equal means, standard deviations, and reliabilities), so scores on parallel tests are completely exchangeable and no comparison problem arises. More often than not, however, multiple forms of a test that measure the same attribute are not parallel, and comparing scores is not straightforward, because the forms may differ in several respects (unequal means, unequal variances, unequal reliabilities, and the like). So, before comparing examinees’ scores on multiple forms of the same test, it is necessary to establish, as nearly as possible, an effective equivalence between raw scores on those forms. This is the problem of equating, which is distinct from the problem of developing test norms, as elucidated in Exhibit 11.1.