Inferring prosody from facial cues for EMG-based synthesis of silent speech

doi:10.1201/b12525-78

ABSTRACT

In this paper we introduce a system that detects prosodic elements in a spoken utterance from signals of the facial muscles. The proposed system can augment our surface electromyography (EMG)-based Silent Speech Interface to make synthesized speech more natural. Having shown (Nakamura, Janke, Wand, & Schultz, 2011) that it is possible to produce understandable synthesized speech from EMG signals, we now aim to improve the quality and expressivity of the synthesis.