ABSTRACT

We present a means of generating and understanding relative spatial positions in a natural three-dimensional (3-D) scene, expressed through six spatial prepositions (left, right, in-front, behind, above, and below), using real stereo images. Our model has two layers. In the first layer, a symbolic spatial description of the scene is computed independently of any reference frame. In the second layer, the meaning of each of the six prepositions is defined with respect to the current reference frame, based on the description from the first layer. These meaning definitions can be used in two ways. First, they allow the system to judge, on a graduated scale, the degree to which each of the six prepositions applies between two 3-D objects. Second, given the 3-D description of one object, they allow the admissible two-dimensional (2-D) image region of the other object to be inferred.
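The two-layer idea can be illustrated with a minimal sketch. It is not the model's actual definition, which the abstract does not give. Here we assume that each object is reduced to a 3-D centroid, that the first layer is stood in for by a plain displacement vector between centroids, and that graded applicability is the clipped cosine between that vector and the relevant axis of the reference frame. All names (`Frame`, `displacement`, `applicability`) are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, Tuple

Vec = Tuple[float, float, float]


@dataclass(frozen=True)
class Frame:
    """Hypothetical reference frame: three unit axes in scene coordinates."""
    right: Vec
    front: Vec
    up: Vec


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _neg(a: Vec) -> Vec:
    return (-a[0], -a[1], -a[2])


def displacement(reference_obj: Vec, located_obj: Vec) -> Vec:
    """Layer 1 stand-in (assumption): a description that does not depend on
    the viewer's frame, here the vector from reference to located object."""
    return (located_obj[0] - reference_obj[0],
            located_obj[1] - reference_obj[1],
            located_obj[2] - reference_obj[2])


def applicability(d: Vec, frame: Frame) -> Dict[str, float]:
    """Layer 2 stand-in (assumption): score each preposition in [0, 1] as the
    cosine between d and the corresponding frame axis, clipped at zero."""
    axes = {
        "right": frame.right, "left": _neg(frame.right),
        "in-front": frame.front, "behind": _neg(frame.front),
        "above": frame.up, "below": _neg(frame.up),
    }
    length = math.sqrt(_dot(d, d))
    if length == 0.0:
        # Coincident centroids: no preposition applies under this scoring.
        return {name: 0.0 for name in axes}
    return {name: max(0.0, _dot(d, axis) / length) for name, axis in axes.items()}


# Example frame whose axes coincide with the scene coordinate axes.
SCENE_FRAME = Frame(right=(1.0, 0.0, 0.0), front=(0.0, 1.0, 0.0), up=(0.0, 0.0, 1.0))
# The same scene seen by a viewer turned half a revolution about the vertical axis.
TURNED_FRAME = Frame(right=(-1.0, 0.0, 0.0), front=(0.0, -1.0, 0.0), up=(0.0, 0.0, 1.0))
```

Calling `applicability(displacement(a, b), SCENE_FRAME)` and then the same call with `TURNED_FRAME` reuses one first-layer description under two frames: the left/right and in-front/behind scores swap while above/below stay the same, which is the separation the two layers are meant to provide. The second use named in the abstract, inferring an admissible 2-D image region, is not covered by this sketch.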