ABSTRACT
Many contemporary tests include constructed-response items, for which the item scores are ordered categorical ratings provided by judges. When the judges' ratings use only two categories, the models described and illustrated with such data in chapter 3 may be used. However, in most cases, responses to extended constructed-response items or performance exercises are relatively long, and their scoring rubrics specify several graded categories of performance. The use of IRT with data from these kinds of items requires generalizations of the models described in chapter 3, to accommodate the larger number of responses.