ABSTRACT

All sound-separation systems based on perception assume a bottom-up or Marr-like view of the world. Sound is processed by a cochlear model, passed to an analysis system, grouped into objects, and then passed to higher-level processing systems. The information flow is strictly bottom up, with no information flowing down from higher-level expectations. Is this approach correct? In this chapter, I first summarize existing bottom-up perceptual models. Then I examine evidence for top-down processing, describing many of the auditory and visual effects that indicate top-down information flow. I hope that this chapter generates discussion about the role of top-down processing, whether this information should be included in sound-separation models, and how we can build testable architectures.