ABSTRACT

This chapter focuses on the acoustic-phonetic level. Human speech is produced as a result of muscular control of the articulators. Because simple articulatory gestures have complex acoustic consequences, some researchers have suggested that rules for the phonetic level of synthesis would be easiest to specify in articulatory terms, for driving an articulatory synthesizer. The most fundamental argument against using articulatory rules is that when humans acquire speech it is auditory feedback that modifies their behaviour, without the speaker being consciously aware of the articulatory gestures. Phonetic rule systems have been developed in a number of laboratories, and some have been incorporated in commercial text-to-speech products. The transition durations specified in the tables can be so long in relation to the element duration that the element cannot contain both its initial and final transitions without their overlapping in time.
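The overlap condition described in the final sentence can be illustrated with a minimal sketch. The function name, the use of milliseconds, and the assumption that both transition durations refer to the portions lying inside the element are illustrative choices, not details taken from the chapter; the sketch only detects the overlap and does not show how a rule system would resolve it.

```python
def transition_overlap_ms(element_ms, initial_ms, final_ms):
    """Return the time (ms) by which the two transitions overlap.

    Hypothetical helper: assumes initial_ms and final_ms are the parts of
    the initial and final transitions that fall inside the element.
    A result of 0.0 means both transitions fit without overlapping.
    """
    needed = float(initial_ms) + float(final_ms)
    return max(0.0, needed - float(element_ms))


# A 60 ms element with 40 ms and 30 ms transitions is 10 ms too short.
short_element = transition_overlap_ms(60, 40, 30)

# A 100 ms element holds the same transitions with time to spare.
long_element = transition_overlap_ms(100, 40, 30)
```

A non-zero result flags an element whose tabulated transitions cannot both be completed within its duration.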