ABSTRACT

This chapter introduces general principles and then briefly describes methods used for automatic speaker and language recognition. The input utterance is first analysed to give a sequence of feature vectors. For closed-set identification tasks, performance can be assessed by testing on a suitable number of samples of speech from each of the categories of interest and calculating the percentage of samples that are correctly recognized. When designing a verification system, the aim is to choose a value for the acceptance threshold that minimizes the number of verification errors. In real applications the preferred setting for the threshold will depend on the costs associated with the two types of error (false acceptance and false rejection), and these costs differ from one application to another. The performance of verification systems is often shown graphically as a receiver operating characteristic (ROC) curve. The feasibility of using text-dependent methods will depend on the type of application, and especially on whether or not the users can be regarded as being ‘co-operative’.
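
The evaluation measures mentioned above can be illustrated with a minimal sketch in Python. It is not taken from the chapter. It assumes that each verification trial yields a single numeric score, that a claim is accepted when the score is at or above the threshold, and that the overall cost is a simple weighted sum of the two error rates. All function names, scores and cost values are hypothetical.

```python
def identification_accuracy(true_labels, predicted_labels):
    """Percentage of test samples whose category was recognized correctly."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)


def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false_rejection_rate, false_acceptance_rate) at a threshold.

    Assumption: a claim is accepted when its score is >= threshold.
    """
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    return frr, far


def roc_points(genuine_scores, impostor_scores):
    """(false acceptance rate, correct acceptance rate) per candidate threshold."""
    points = []
    for t in sorted(set(genuine_scores) | set(impostor_scores)):
        frr, far = error_rates(genuine_scores, impostor_scores, t)
        points.append((far, 1.0 - frr))
    return points


def cheapest_threshold(genuine_scores, impostor_scores, cost_fr, cost_fa):
    """Candidate threshold with the lowest weighted error cost.

    Assumption: cost = cost_fr * FRR + cost_fa * FAR; only observed
    score values are tried as thresholds.
    """
    candidates = sorted(set(genuine_scores) | set(impostor_scores))

    def cost(t):
        frr, far = error_rates(genuine_scores, impostor_scores, t)
        return cost_fr * frr + cost_fa * far

    return min(candidates, key=cost)


# Made-up scores for true speakers and for impostors.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]

# An application where false acceptances are costly favours a higher threshold
# than one where false rejections are costly.
strict = cheapest_threshold(genuine, impostor, cost_fr=1, cost_fa=10)
lenient = cheapest_threshold(genuine, impostor, cost_fr=10, cost_fa=1)
```

With these made-up scores, the setting that penalizes false acceptances more heavily selects a higher threshold than the setting that penalizes false rejections, which is the sense in which the preferred threshold depends on the application. Plotting the pairs returned by `roc_points` gives one common form of the ROC curve.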