Signal and Feature Compensation Methods for Robust Speech Recognition

doi:10.1201/9781315220109-9

ABSTRACT

Automatic speech recognition systems are pattern classifiers designed to solve a specific statistical pattern classification problem. For the benefit of readers with a limited background in speech recognition technologies, this chapter begins by reviewing the formulation of automatic speech recognition as a statistical pattern classification process and by discussing how environmental disturbances degrade classifier performance. Linear spectral subtraction, also referred to simply as spectral subtraction, is a method of canceling additive uncorrelated noise from a noisy speech signal. In spectral subtraction, the signal is separated into speech and nonspeech regions by any of a variety of techniques, and all regions deemed to be nonspeech are used to estimate the noise spectrum, which is then subtracted from the spectrum of the noisy signal. Nonlinear spectral subtraction attempts to improve on this noise cancellation by making the oversubtraction factor, the multiplier applied to the noise estimate before subtraction, depend in each frequency band on the local signal-to-noise ratio. The chapter describes selected signal and feature compensation techniques in current use, chosen for their efficiency and generality.
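
The two spectral subtraction variants summarized above can be illustrated with a minimal NumPy sketch. It is not the chapter's implementation. It assumes the signal has already been framed and converted to per-band power spectra, and that a speech/nonspeech label per frame is available from some detector. The function names, the spectral floor, the use of noisy-to-noise power as the "local" signal-to-noise ratio, and the linear mapping from that ratio to the oversubtraction factor are all hypothetical choices made for illustration.

```python
import numpy as np


def estimate_noise_power(power_frames, is_speech):
    """Average the power spectra of all frames labelled nonspeech.

    power_frames: array of shape (n_frames, n_bands)
    is_speech:    boolean array of shape (n_frames,)
    """
    return power_frames[~is_speech].mean(axis=0)


def spectral_subtract(power_frames, noise_power, alpha=1.0, floor=0.01):
    """Subtract alpha * noise estimate in every band.

    alpha may be a scalar (linear spectral subtraction) or an array that
    broadcasts against power_frames (band- and frame-dependent factor).
    Results are clamped to floor * noise_power so that no band goes
    negative; the floor value is an assumption, not taken from the chapter.
    """
    cleaned = power_frames - alpha * noise_power
    return np.maximum(cleaned, floor * noise_power)


def snr_dependent_alpha(power_frames, noise_power,
                        alpha_min=1.0, alpha_max=3.0,
                        snr_lo_db=-5.0, snr_hi_db=20.0):
    """Hypothetical linear map from local SNR (dB) to oversubtraction factor.

    Bands with low SNR receive a factor near alpha_max, bands with high
    SNR a factor near alpha_min. The SNR here is approximated by the ratio
    of noisy power to estimated noise power.
    """
    eps = 1e-12  # guards against log of zero
    snr_db = 10.0 * np.log10(np.maximum(power_frames, eps)
                             / np.maximum(noise_power, eps))
    t = np.clip((snr_hi_db - snr_db) / (snr_hi_db - snr_lo_db), 0.0, 1.0)
    return alpha_min + t * (alpha_max - alpha_min)


# Toy input: two nonspeech frames followed by one speech frame, two bands.
power_frames = np.array([[1.0, 1.0],
                         [1.0, 1.0],
                         [5.0, 1.0]])
is_speech = np.array([False, False, True])

noise_power = estimate_noise_power(power_frames, is_speech)
linear = spectral_subtract(power_frames, noise_power)
alpha = snr_dependent_alpha(power_frames, noise_power)
nonlinear = spectral_subtract(power_frames, noise_power, alpha=alpha)
```

In this toy input, the band carrying speech energy in the last frame has a higher local signal-to-noise ratio than its neighbor and therefore receives a smaller oversubtraction factor, which is the behavior the nonlinear variant is meant to provide.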