ABSTRACT

Several techniques for identifying segment transitions in an audio stream are discussed. Gross features are identified first and used to control more detailed, computationally expensive analysis downstream. Pitch is tracked using basic streaming principles and then used as one cue to speaker transitions. A novel speaker discrimination technique is described that makes a segmentation decision when a continuously updated model of the current speaker suddenly ceases to account sufficiently for the input data.
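
The model-based segmentation idea summarized above can be illustrated with a minimal sketch. This is not the method of the paper, whose details are not given in the abstract. It assumes a single scalar feature per frame, a running Gaussian as the "speaker model", and a fixed log-likelihood threshold. The names RunningGaussian, detect_transitions, threshold, patience, and alpha are all hypothetical and chosen only for illustration.

```python
import math


class RunningGaussian:
    """Hypothetical 1-D speaker model: exponentially weighted mean and variance."""

    def __init__(self, alpha=0.05, var_floor=1e-3):
        self.alpha = alpha          # forgetting factor (assumed value)
        self.var_floor = var_floor  # keeps the variance from collapsing to zero
        self.mean = None
        self.var = 1.0

    def log_likelihood(self, x):
        var = max(self.var, self.var_floor)
        return -0.5 * (math.log(2 * math.pi * var) + (x - self.mean) ** 2 / var)

    def update(self, x):
        if self.mean is None:       # first frame seeds the model
            self.mean = x
            return
        delta = x - self.mean
        self.mean += self.alpha * delta
        self.var = (1 - self.alpha) * (self.var + self.alpha * delta * delta)


def detect_transitions(features, threshold=-8.0, patience=3, alpha=0.05):
    """Return frame indices where the running model stops explaining the input.

    A transition is declared when `patience` consecutive frames score below
    `threshold`; the model is then restarted on the frames that triggered it.
    """
    model = RunningGaussian(alpha)
    transitions = []
    misses = 0
    for i, x in enumerate(features):
        if model.mean is None:
            model.update(x)
            continue
        if model.log_likelihood(x) < threshold:
            misses += 1
            if misses >= patience:
                start = i - patience + 1
                transitions.append(start)
                model = RunningGaussian(alpha)
                for y in features[start:i + 1]:
                    model.update(y)
                misses = 0
            # poorly explained frames are not used to update the current model
        else:
            misses = 0
            model.update(x)
    return transitions


if __name__ == "__main__":
    frames = [0.0, 0.1, -0.1, 0.05] * 5 + [10.0, 10.1, 9.9, 10.05] * 5
    print(detect_transitions(frames))
```

Requiring several consecutive poorly explained frames before declaring a transition is one simple way to separate a sudden, sustained mismatch from an isolated outlier. A practical system would presumably use multidimensional spectral features and a richer model.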