ABSTRACT

One of the fundamental questions of language is how listeners map the acoustic signal onto syllables, words, and sentences, ultimately arriving at an understanding of spoken language. For normal listeners, this mapping is fast and effortless, occurring so automatically that in everyday conversation we rarely think about how it might work. Studies of the speech signal, however, have provided ample evidence that the system contains a great deal of underlying complexity that is not evident to the casual listener. Several models currently compete to explain how these intricate speech processes operate (Klatt, 1979; Marslen-Wilson & Warren, 1994; McClelland & Elman, 1986; Norris, McQueen, & Cutler, 2000). These models have narrowed the problem to mapping the speech signal onto isolated words, setting aside the complexity of segmenting continuous speech. Continuous speech remains a major challenge for most models because of the increased variability of the signal and the well-documented difficulty of segmenting the speech stream into words.