ABSTRACT

Many contemporary tests include both multiple-choice items and extended constructed-response items, for which the item scores are ordered categorical ratings provided by judges. Examples include the National Assessment of Educational Progress (NAEP) (Calderone, King, & Horkay, 1997), many of the Advanced Placement (AP) examinations (College Entrance Examination Board, 1988), state assessments that have been administered at times in North Carolina, Wisconsin, and many other states, and other less visible testing programs. If the collection of items is sufficiently well represented by a unidimensional IRT model, scale scores may be a viable plan for scoring such a test.