Fundamentals of Speech Processing | 5 | Random Signal Processing

ABSTRACT

This chapter introduces the basic concepts and algorithms used to process speech signals. It explains linear time-invariant and linear time-varying models for speech, and discusses speech parameters such as fundamental frequency, formants, pitch contour, mel-frequency cepstral coefficients, and linear predictor coefficients. The mel-frequency cepstrum is a representation of the short-time power spectrum of a speech signal, calculated as a linear cosine transform of a log power spectrum on a nonlinear mel scale of frequency. Speech is produced by the vocal cords and the vocal tract: an excitation signal is generated at the larynx (the voice box) and shaped by the vocal tract, so the speech signal is the convolution of the excitation with the impulse response of the vocal tract. A speech signal consists of voiced and unvoiced segments.
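The convolution view of speech can be illustrated with a minimal sketch. It is not taken from the chapter. It assumes an idealized voiced excitation (a unit impulse train at the pitch period) and a toy vocal tract with a single damped resonance. All function names and parameter values (8 kHz sampling rate, 100 Hz pitch, a 500 Hz resonance with 100 Hz bandwidth) are hypothetical and chosen only for illustration.

```python
import math

def impulse_train(n_samples, period):
    """Idealized voiced excitation: one unit pulse every `period` samples."""
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

def resonator_impulse_response(n_samples, freq_hz, bandwidth_hz, fs):
    """Truncated impulse response of one damped resonance (toy one-formant vocal tract)."""
    decay = math.exp(-math.pi * bandwidth_hz / fs)   # per-sample amplitude decay
    omega = 2.0 * math.pi * freq_hz / fs             # resonance frequency in rad/sample
    return [(decay ** n) * math.cos(omega * n) for n in range(n_samples)]

def convolve(x, h):
    """Direct linear convolution: y[n] = sum over k of x[k] * h[n - k]."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        if xi == 0.0:
            continue  # skip zero samples of the sparse excitation
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

# Assumed illustration values, not from the source text.
fs = 8000                      # sampling rate in Hz
pitch_hz = 100                 # fundamental frequency of the excitation
period = fs // pitch_hz        # pitch period in samples

excitation = impulse_train(400, period)
vocal_tract = resonator_impulse_response(64, 500.0, 100.0, fs)
speech = convolve(excitation, vocal_tract)   # synthetic "voiced" segment
```

Because the truncated impulse response here is shorter than the pitch period, each excitation pulse simply places one copy of the vocal-tract response in the output. With a longer response the copies would overlap and add, which is the general case for real speech.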