ABSTRACT

Speech recognition is the process of automatically extracting and determining the linguistic information conveyed by a speech wave using computers or electronic circuits. This chapter focuses on two approaches to handling individual variation: the multi-template method, in which multiple templates are created for each vocabulary word by clustering individual variations, and the statistical method, in which individual variations are represented by the statistical parameters of hidden Markov models. It also covers speaker normalization and adaptation methods, in which the speaker variability of input speech is automatically normalized or speaker-independent models are adapted to each new speaker. Automatic speech recognition methods have been investigated for many years, principally with the aim of realizing transcription and human–computer interaction systems. The staggered array dynamic programming matching method realizes completely symmetrical, unconstrained-endpoint matching and reduces the amount of computation by thinning out the lattice points in the plane spanned by the two time sequences. Word spotting techniques have been applied to a wide range of problems in which unexpected speech input can occur.