ABSTRACT

Assessments based on spoken responses have been used for centuries in job interviews, medical diagnoses, and formal schooling. This chapter presents the methods and limits of automatic scoring of spoken responses in testing and instructional contexts. Much of the content of a spoken response (word sequences, propositions, some hedges and valence cues) can be found in a text transcription of the response, but many speech-borne cues have no conventional orthographic representation. The chapter describes the methods currently available for extracting lexical content and other information from spoken responses in assessments, whose purposes may include selection and qualification as well as formative guidance in learning.

Following a short introduction in Section 1, Section 2 reviews the kinds of information that can be extracted from speech signals and their relation to information in text. Speech includes linguistic, paralinguistic, and indexical information. This chapter focuses on the linguistic and paralinguistic aspects of speech, although indexical information has uses in test security and instruction. The most important paralinguistic aspects of speech provide evidence of speaker states: emotional, physical, and cognitive. In assessments, speaker traits may be a part of the intended score-use construct. Section 3 describes the spoken language processing (SLP) methods used to extract assessment-relevant information from speech signals, including the lexical sequence from automatic speech recognition (ASR), as well as fine-grained data in the time domain (segment, syllable, word, and phrase times; latency and pause times) and in the frequency/amplitude domain (energy; formant tracks; voicing, fundamental frequency (F0), jitter and shimmer, harmonics-to-noise ratio, and perceptual linear prediction (PLP) components). The section also covers the accuracy, consistency, and applicability of these measures. Section 4 reviews a range of SLP applications in assessment and then details two applications in education and psychology: childhood reading development (accurate rate, fluency, perseverance, comprehension) and second language development in adults. A final section offers a note on transparency and bias in these technologies.