ABSTRACT

In its early period, item writing was regarded as an art (Ebel, 1951), with less consideration given to its science. Current attention is focused on developing the science of item writing. However, that science continues to lag behind the science of psychometrics (e.g., scoring, scaling, equating). Empirical research on item writing dates back to the early 1920s, a time when the science of psychometrics was also under development; even so, attention to item development and validation has been limited. Reckase (2010) confessed that “test items are complicated” (p. 4) and that good item writers should be rewarded and honored. He argued that good item writing is a literary form that has not received its due recognition. As the building blocks of standardized tests, test items must be carefully and systematically developed in support of the validity of intended interpretations and uses of test scores. The Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 2014) provide guidance for item and test development (see Chapter 4), including a cluster of standards (Cluster 2, Standards for Item Development and Review) devoted to item development. Each of these standards is referenced where appropriate in what follows.