Normal adults perceive a unity in the auditory and visual characteristics of many events in their world. When they look at and listen to a friend talking to them, they perceive the friend’s voice as spatially and temporally related to his or her mouth and its movements. As a consequence of their experience with such spatio-temporal relationships, adults can often predict from an object’s appearance where and how it will sound, or from its sounds where and how it will appear. Adults’ ability to perceive the invariant spatio-temporal characteristics of many multimodal objects and events reveals the impressive economy and stability of human perception.