ABSTRACT

This paper illustrates how the statistical structure of natural signals may help us understand cognitive phenomena. We focus on a regularity found in audiovisual speech perception: experiments by Massaro and colleagues consistently show that optic and acoustic speech signals have separable influences on perception. From a Bayesian point of view, this regularity reflects a perceptual system that treats optic and acoustic speech as if they were conditionally independent signals. In this paper we perform a statistical analysis of a database of audiovisual speech to check whether optic and acoustic speech signals are indeed conditionally independent. If so, the regularity found by Massaro and colleagues could be seen as an optimal processing strategy of the perceptual system. We analyze a small database of audiovisual speech using hidden Markov models, the most successful models in automatic speech recognition. The results suggest that acoustic and optic speech signals are indeed conditionally independent and that, therefore, the separability found by Massaro and colleagues may be explained in terms of optimal perceptual processing: independent processing of optic and acoustic speech results in no significant loss of information.
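
For clarity, the conditional-independence hypothesis under discussion can be stated formally. The symbols below, $A$ for the acoustic signal, $V$ for the optic signal, and $c$ for the underlying speech category, are our illustrative notation rather than the paper's; this is only a minimal sketch of the assumption being tested.

\[
  P(A, V \mid c) \;=\; P(A \mid c)\, P(V \mid c)
\]

Under this assumption Bayes' rule factorizes, $P(c \mid A, V) \propto P(c)\, P(A \mid c)\, P(V \mid c)$, so the two modalities can be processed separately and then combined without loss of information about the speech category.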