ABSTRACT

A neural network model for object recognition based on Biederman’s (1987) theory of Recognition by Components (RBC) is described. RBC assumes that objects are recognized as configurations of simple volumetric primitives called geons. The model takes a representation of the edges in an object as input and, as output, activates an invariant, entry-level representation of the object that specifies the object’s component geons and their interrelations. Local configurations of image edges first activate cells representing local viewpoint-invariant properties (VIPs), such as vertices and 2-D axes of parallelism and symmetry. Once activated, VIPs are bound into sets through temporal synchrony in the firing patterns of cells representing the VIPs and image edges belonging to a common geon. The synchrony is established by a mechanism that operates only between pairs of (a) collinear, (b) parallel, and (c) coterminating edge and VIP cells. This design for perceptual organization through temporal synchrony is a major contribution of the model. A geon’s bound VIPs activate independent representations of the geon’s major axis and cross section, location in the visual field, aspect ratio, size, and orientation in 3-space. The relations among the geons in an image are then computed from the representations of the geons’ locations, scales, and orientations. The independent specification of geon properties and interrelations uses representational resources efficiently and yields a representation that is completely invariant with translation and scale and largely invariant with viewpoint. In the final layers of the model, this representation is used to activate cells that, through self-organization, learn to respond to individual objects.
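
To make the binding step concrete, the following is a minimal sketch of grouping edges through pairwise links. It is a static analogue only, not the model's mechanism: the model binds cells dynamically through synchronized firing, whereas this sketch links any two edges that are collinear, parallel, or coterminating and takes the transitive closure of those links with a union-find structure. The edge representation (pairs of 2-D endpoints), the tolerance `TOL`, and the function names are assumptions made for illustration.

```python
from itertools import combinations

TOL = 1e-9  # assumed numerical tolerance, not a parameter of the model


def direction(edge):
    """Direction vector of an edge given as ((x1, y1), (x2, y2))."""
    (x1, y1), (x2, y2) = edge
    return (x2 - x1, y2 - y1)


def parallel(a, b):
    """True if the edges have the same orientation (collinear edges included)."""
    (ux, uy), (vx, vy) = direction(a), direction(b)
    return abs(ux * vy - uy * vx) < TOL


def coterminating(a, b):
    """True if the edges share an endpoint."""
    return any(
        abs(p[0] - q[0]) < TOL and abs(p[1] - q[1]) < TOL
        for p in a
        for q in b
    )


def bind_edges(edges):
    """Group edge indices by the transitive closure of pairwise links.

    Stand-in for binding by synchrony: edges in one group would fire
    together; edges in different groups would not.
    """
    parent = list(range(len(edges)))

    def find(i):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(edges)), 2):
        if parallel(edges[i], edges[j]) or coterminating(edges[i], edges[j]):
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(edges)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical input: a unit square and a distant triangle.
square = [((0, 0), (1, 0)), ((1, 0), (1, 1)), ((1, 1), (0, 1)), ((0, 1), (0, 0))]
triangle = [((10, 10), (12, 11)), ((12, 11), (11, 13)), ((11, 13), (10, 10))]
groups = bind_edges(square + triangle)
print(groups)
```

In this example the four square edges form one group and the three triangle edges form another, because no triangle edge is parallel to or coterminates with a square edge. Because parallelism is tested across the whole image, edges of different objects that happen to be parallel would be merged into one group; that is a limitation of this simplified sketch.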