ABSTRACT

This paper presents an approach to user-dependent emotion recognition from affective speech, based on multiple features drawn from prosodic information and Higher Order Spectra (HOS) analyses. Prosodic features, including speech rate, short-term energy, and pitch-related features, are extracted to capture the linear aspects of the affective speech signal, while HOS analyses are used to capture its nonlinear aspects. A subset of significant representative features is selected to form a reduced feature set, on which the classifiers for emotion recognition are built. The proposed algorithm, which relies on the predictive accuracy of dynamic time warping, is implemented on our six-legged robot, which is designed for search and rescue tasks at real disaster sites. Experimental results illustrate the effectiveness of the algorithm in both emotional speech recognition and feature dimensionality reduction. An important finding is that the proposed algorithm is effective at identifying negative-type emotions, with excellent predictive capability.