ABSTRACT

Multiple measures is the generic term for the practice of using more than one score or piece of information when making judgments about groups such as classes, schools, or local education agencies, or about individuals such as students or teachers (Brookhart 2009). The importance of using multiple measures to inform decisions is an established tenet of educational measurement and testing. It is ingrained in professional standards for best practices and codified in United States state and federal law (DePascale 2012).

The use of multiple measures is particularly recommended or required by law when classifications to be assigned or decisions to be made based on test scores have high stakes for individuals or groups (AERA, APA, NCME 2014; JCTP 2004). Common situations in educational settings for which the use of multiple measures is appropriate include decisions related to individual student promotion or graduation, college admission or course placement, educator evaluation, and school accountability. In each of those cases, the decision-making process will be enhanced by the use of data and information in addition to a single test score.

There are a variety of ways in which the use of multiple measures can enhance the accuracy, appropriateness, or fairness of a decision. This variety results in multiple, distinct meanings for the term multiple measures, with the appropriate interpretation depending on the context of the decision to be made (Brookhart 2009). One common use of the term refers to the practice of offering individual students multiple opportunities to take and pass a test used for high-stakes decisions such as high school graduation. For decisions such as high school graduation or college admission, multiple measures also refers to supplementing a test score with additional information such as course grades, teacher observations, or scores from different tests of the same construct. Portfolios, or collections of student work evaluated by one or more teachers or external raters, are another example of this interpretation of multiple measures (Moss 1994). A third interpretation of multiple measures refers to the use of information in addition to the individual or group achievement information provided by a test score in order to provide a more complete picture of the performance of the individual or group being evaluated (Bernhardt 1998).

The decision to use multiple measures leads to the need to decide how best to combine information from the multiple sources to arrive at an appropriate decision. Three general categories of approaches to combining data from multiple sources are conjunctive, compensatory, and complementary (Brookhart 2009). Conjunctive approaches require a student or school to meet an established standard on each of the measures. Compensatory approaches, in contrast, allow higher performance on one measure to offset, or compensate for, lower performance on one or more other measures. Compensatory approaches also require decisions about how the scores from various measures will be combined to arrive at an overall score and how to establish a passing standard based on that overall score. Complementary approaches include a variety of approaches in which there are options regarding which measures will be used to inform the decision. Approaches to combining information from multiple measures may be primarily statistical/quantitative or judgmental/qualitative. In all cases, the approach used to combine information should be consistent with the objectives of administering multiple measures (Mosier 1951; Moss 1994).
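
The three categories of combination rules can be made concrete with a small sketch. The Python below is illustrative only: the measure names, cut scores, and weights are hypothetical, the compensatory rule is assumed to be a weighted average, and the complementary rule is assumed to be the simple case in which meeting the standard on any one available measure is sufficient. Operational systems may define each rule differently.

```python
from typing import Mapping


def conjunctive(scores: Mapping[str, float], cutoffs: Mapping[str, float]) -> bool:
    """Pass only if every measure meets its own standard."""
    return all(scores[m] >= cutoffs[m] for m in cutoffs)


def compensatory(scores: Mapping[str, float],
                 weights: Mapping[str, float],
                 overall_cutoff: float) -> bool:
    """Pass if the weighted average meets one overall standard.

    A high score on one measure can offset a low score on another.
    The weighted average is an assumption; other composites are possible.
    """
    total_weight = sum(weights.values())
    composite = sum(scores[m] * w for m, w in weights.items()) / total_weight
    return composite >= overall_cutoff


def complementary(scores: Mapping[str, float], cutoffs: Mapping[str, float]) -> bool:
    """Pass if at least one available measure meets its standard.

    This is one simple reading of "complementary"; measures the
    individual did not take are skipped rather than counted as failures.
    """
    return any(scores[m] >= cutoffs[m] for m in cutoffs if m in scores)


# Hypothetical student: weak test score, strong grades, adequate portfolio.
student = {"test": 55, "grades": 85, "portfolio": 70}
cutoffs = {"test": 60, "grades": 60, "portfolio": 60}
weights = {"test": 5, "grades": 3, "portfolio": 2}

print("conjunctive:  ", conjunctive(student, cutoffs))
print("compensatory: ", compensatory(student, weights, overall_cutoff=60))
print("complementary:", complementary(student, cutoffs))
```

For this hypothetical student the conjunctive rule fails, because the test score is below its cut score, while the compensatory and complementary rules both pass. The same set of scores can therefore lead to different decisions depending on the combination rule chosen, which is why the rule should follow from the purpose of collecting multiple measures.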

Despite consensus on the importance of multiple measures to support appropriate and accurate decision-making, practical constraints often limit the manner in which, and the extent to which, multiple measures are used in real-life situations. Time, cost, availability or accessibility of additional data, and concerns regarding data quality and comparability are among the factors affecting the use of multiple measures.