ABSTRACT

Theories of vision can be categorized by the amount of explicit representation they postulate in the perceiver. Gibson’s precomputational theory eschewed any explicit representation. In contrast, Marr used layers of explicit representation, hoping to simplify vision computations. Recent advances in robotic hardware and computer architectures have made it possible to build anthropomorphic devices that capture important technical features of human vision. Experience with these devices suggests that cooperative gaze-control behaviors can reduce the need for explicit representation. This view is captured in the notion of “animate vision,” a framework for sequential decision making and visual learning.