ABSTRACT

The use of automated scoring, as opposed to human scoring, in speaking assessments has increased. Commercial products that employ automated evaluation capabilities, such as EduSpeak, Carnegie Speech, Duolingo, and Liulishuo, are rapidly emerging. These products are primarily intended to support second language learning in no- or low-stakes environments. As with the automated scoring of open-ended text responses, automated scoring of free-form spoken responses offers certain advantages over human scoring: it can facilitate near-instant score turnaround and provide detailed, potentially diagnostic performance feedback, as discussed by M. Zhang. Given the abundance of existing literature and frameworks, we do not intend to propose yet another framework. Instead, drawing on the existing literature, we emphasize two important aspects of validation that are commonly called for by investigators, developers, and users. These two aspects are central to most, if not all, validity-related studies in the field of automated scoring.