ABSTRACT

The goal of speech coding, or speech compression, is to represent speech in digital form with as few bits as possible while maintaining the intelligibility and quality required for the particular application. Speech and audio coding can be classified according to the bandwidth occupied by the input and the reproduced source. The term intelligibility usually refers to whether the output speech is easily understandable, while the term quality indicates how natural the speech sounds. The mean opinion score (MOS) is a widely used subjective performance measure. To establish a MOS for a coder, listeners are asked to classify the quality of the encoded speech into one of five categories: excellent, good, fair, poor, or bad; these categories are scored from 5 down to 1, and the scores are averaged across listeners. The Diagnostic Acceptability Measure (DAM), developed by Dynastat, is an attempt to make the measurement of speech quality more systematic. The MPEG-4 audio coding standard specifies a complete toolbox of compression methods for sources ranging from low-bit-rate narrowband speech through wideband speech to high-quality audio.