ABSTRACT

Over the past decade, standardized tests have come under increasing public scrutiny for possible bias against various subgroups, including ethnic, racial, and gender subgroups. (See chapter 20 by McAllister for a discussion of testing, differential item functioning, and public policy.) However, fairness in items and tests has long been a matter of great concern to test developers. At testing companies such as Educational Testing Service (ETS), judgmental review procedures are used to prevent offensive or stereotyped material from being included in any test and to ensure that the tests reflect the multicultural society of the United States to the extent allowed by test specifications. (See chapter 19 by Ramsey for a description of this process.) Reviews of tests for statistical evidence of differential item functioning (DIF) are also used to identify items that show differential performance by groups of examinees. In addition to these reviews, DIF research is conducted in an effort to improve the items included in every test. This research responds to the concerns of test developers, test users, and examinees by seeking to identify characteristics of items that may cause significant group differences in difficulty.