ABSTRACT

This chapter describes an auditory model designed to stabilize repeating patterns of phase-locking information, a method of extending the model to binaural processing, the "data-rate problem" associated with auditory models as speech preprocessors, and the construction of a noise-resistant binaural auditory spectrogram for speech recognition. Binaural processing is better understood than figure-ground separation or feature extraction at this point in time, and so the chapter focuses on binaural processing. The auditory image reveals the pitch and loudness of the source and its sound quality, or timbre. Binaural auditory spectrograms were produced in two different ways using a pair of cochlear simulations and a binaural processor. For the traditional recognizer, a vector of 20 mel-frequency cepstral coefficients spanning the frequency range 430 to 6641 Hz was calculated at 5-ms intervals for each sentence of speech in the database.