ABSTRACT

Although diverse, theories of visual attention generally share the notion that attention is controlled by some combination of three distinct strategies: (1) exogenous cueing from locally contrasting primitive visual features, such as abrupt onsets or color singletons (e.g., Itti & Koch, 2000); (2) endogenous gain modulation of exogenous activations, used to guide attention to task-relevant features (e.g., Navalpakkam & Itti, 2005; Wolfe, 1994); and (3) endogenous prediction of likely locations of interest, based on task and scene gist (e.g., Torralba, Oliva, Castelhano, & Henderson, 2006). Because these strategies can make conflicting suggestions, theories posit arbitration and combination rules. We propose an alternative conceptualization consisting of a single unified mechanism that is controlled along two dimensions: the degree of task focus and the spatial scale of operation. Previously proposed strategies, and their combinations, can be viewed as instances of this mechanism. Thus, our theory serves not as a replacement for existing models, but as a means of bringing them into a coherent framework. It also offers a way to integrate data from a wide range of attentional phenomena. More importantly, the theory yields an unusual perspective on attention that places a fundamental emphasis on the role of experience and task-related knowledge.