CONTENTS
8.1 Speech Recognition: History
8.2 ASR-Problem Formulation
8.3 ASR Structure
8.4 Phones
8.5 Phonetic Alphabets
8.6 Deterministic Sequence Recognition
8.7 Statistical Sequence Recognition
8.8 Language and Auditory Models
8.9 Speech Recognition Datasets
8.10 Summary
Bibliography

8.1 Speech Recognition: History

Speech recognition has a history stretching back roughly 100 years. However, only recently has the dream of “talking to a computer” become an effective reality. One of the earliest and most notable speech recognisers was Radio Rex. There were no serious attempts at automatic speech recognition (ASR) for another 30 years, until Bell Labs developed an isolated digit recognition system [2]. From that point on, speech recognition methods have improved gradually, with increases in both vocabulary size and robustness. Current speech recognition systems using deep networks achieve recognition performance very close to that of a human.