ABSTRACT

Standard setting is the process of determining the scores on an assessment that are used to classify examinees into interpretable levels of performance. Item response theory (IRT) adds considerable value to the practice of standard setting by characterizing the proficiency of examinees and the difficulty of test items on a common scale. In this article, we provide a brief review of non-IRT standard-setting methods and of the evolution of item mapping that led to its use for score interpretation and standard setting. Then, we describe applications of common IRT models to support standard setting, policy decisions that are uniquely associated with IRT-based standard-setting methods, and how these decisions affect the standard-setting methodology. Following that, we provide an overview of a typical IRT-based standard-setting workshop and conclude with a discussion of current trends in IRT-based standard setting, future directions, and open research questions.