ABSTRACT

This chapter highlights key considerations in evaluating the fairness of measures that compare test scores across multiple grade levels, both for individual students from subgroups and for aggregated subgroup performance. It encourages further discussion and research to help establish clearer guidelines for testing programs to follow when collecting fairness evidence for their cross-grade comparison measures. The chapter considers three general approaches: using vertical scales (e.g., a gain score or trajectory model), using conditional status or normative growth models (e.g., the Student Growth Percentile model), and comparing performance levels across grades (e.g., a value table model). The term categorical model is a general term for models that compare performance levels across grades; transition (matrix) model and value table model are other terms that are often used interchangeably with it. In choosing cross-grade comparison measures, programs face the challenge of balancing accuracy, transparency, interpretability, and other validity concerns, such as within-grade construct validity, that may partially conflict with fairness considerations for cross-grade inferences.