ABSTRACT

Avatar lip syncing plays a vital role in making speech animation realistic. The problem has been studied extensively because of its applications in animation and gaming and, more recently, in interactive chatbots for education and business. In this paper, the authors present a method for real-time avatar lip sync: when text is entered, the avatar speaks it, pausing where the punctuation indicates. This creates the impression that the avatar is actually speaking. Most previous work uses 3D avatars with a facial mesh and locates the positions of motion on the mesh as the speech is generated, which requires key-frame positioning over the duration of the speech. That process may call for skilled animators and may discourage less experienced developers from adding lip sync to otherwise simple projects. The goal of this work is to provide a comparable lip-sync experience with a 2D image of the 3D model. The authors identify the words in the entered text, extract their phonemes, and apply a standard phoneme-to-viseme mapping. They then overlay the avatar images corresponding to the visemes to create the illusion that the lips are synchronized with the speech.
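
The pipeline summarized above (words, then phonemes, then visemes, then avatar images, with pauses at punctuation) can be illustrated with a minimal sketch. This is not the authors' implementation: the lexicon, the viseme class names, the image file names, and all timing values are hypothetical placeholders chosen only to show the flow of data.

```python
import re

# Hypothetical toy pronunciation lexicon (ARPAbet-style symbols). A real
# system would use a full pronouncing dictionary or a grapheme-to-phoneme model.
LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}

# Illustrative phoneme-to-viseme grouping; the class names are assumptions.
PHONEME_TO_VISEME = {
    "HH": "open", "AH": "open",
    "L": "tongue", "D": "tongue",
    "OW": "round", "W": "round", "ER": "round",
}

# Assumed pause lengths in seconds for each punctuation mark.
PAUSES = {",": 0.25, ";": 0.35, ".": 0.5, "?": 0.5, "!": 0.5}

FRAME_SECONDS = 0.1  # assumed display time for one viseme image
REST = "rest"        # assumed name of the closed-mouth image


def text_to_frames(text):
    """Return a list of (image_name, seconds) pairs for the given text."""
    frames = []
    # Split the text into words and the punctuation marks that trigger pauses.
    for token in re.findall(r"[A-Za-z']+|[,.;?!]", text):
        if token in PAUSES:
            # Punctuation: hold the resting mouth shape for the pause length.
            frames.append((REST + ".png", PAUSES[token]))
            continue
        # Words missing from the toy lexicon produce no frames in this sketch.
        for phoneme in LEXICON.get(token.lower(), []):
            viseme = PHONEME_TO_VISEME.get(phoneme, REST)
            frames.append((viseme + ".png", FRAME_SECONDS))
    return frames


if __name__ == "__main__":
    for image, seconds in text_to_frames("Hello, world."):
        print(image, seconds)
```

A renderer would then draw each listed image over the avatar for the stated duration while the synthesized audio plays. Keeping the frame list separate from the rendering step means the mapping tables can be changed without touching the display code.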