ABSTRACT

Speech is a striking human skill, both in the precision of the motor acts that production involves and in the complexity of the acoustic signal that we perceive as meaningful words. Our perception of speech is both robust and flexible: we are able to follow speakers with a wide variety of accents and moods, in very adverse auditory environments. This perceptual ability is all the more impressive when the actual nature of the speech signal is considered. Speakers do not produce simple strings of regular phonemes that the listener then sequences into words. There is considerable variability in the acoustic signal, resulting from speaker differences, coarticulation, and assimilation effects. This variability precludes a simple linear mapping between the acoustic signal and the identity of the phone that is expressed (Bailey & Summerfield, 1980). It is also important to note that the ‘surface’, acoustic representation of speech is not wholly separable from the intended meaning: coarticulation has been suggested to have a communicative quality (Whalen, 1990) in addition to its role in making speech production more fluid; assimilation effects are constrained by syntactic features (Hawkins, 2003); and prosody influences the linguistic information in speech at many levels. The aim of this chapter is to delineate the neural systems involved in speech perception, rather than the whole language system. I argue that, neurally, robust and flexible speech perception is supported by multiple, parallel processing streams, and that these streams bear some relation to the anatomy of primate auditory cortex. I also address the basis of functional asymmetries between the left and right temporal lobes, which have often been invoked to explain the left-hemispheric dominance in speech perception.