ABSTRACT

Comparing objects that differ in number of nodes, string length, size, or shape can undermine the meaning or utility of a symmetric distance or similarity measure, such as the Euclidean distance or the Tanimoto similarity. A very simple example from the social sciences is the clustering of a larger group of individuals into a number of social cliques. Individuals rank their feelings toward or perceptions of one another, say, on a scale from 1 to 10. A proximity matrix of such values is asymmetric, since the two values for a pair of individuals may differ: Ida likes Joey (the (Ida, Joey) entry), but Joey just tolerates Ida (the (Joey, Ida) entry). Such data can then be clustered to find groups of like-minded individuals (social cliques). Another simple example involves shapes. Two objects of the same shape, say two triangles, might appear quite “similar” to an observer, but a symmetric measure would rate them as quite dissimilar if the two triangles differ greatly in size. An asymmetric measure, then, is one in which the order in which two objects are compared may change the value of the measure. Figure 8.1 shows two objects being compared, in (a) and (b). A symmetric similarity measure would find these objects to be quite “different” if their size difference is included in the comparison. However, the fact that the triangle in Figure 8.1b can “fit into” the triangle in Figure 8.1a may be an interesting basis for grouping these objects together. An asymmetric measure would have a high value when Figure 8.1b is compared with Figure 8.1a (the smaller triangle fitting into the larger) and a low value when the object in Figure 8.1a is “fitted” into the triangle depicted in Figure 8.1b.
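
The two examples above can be made concrete with a minimal Python sketch. Everything in it is an illustrative assumption rather than a measure defined in this chapter: the rating values are invented, and `fit_score` is a hypothetical area-based stand-in for the “fits into” idea, meaningful only for two objects of the same shape.

```python
# Hypothetical ratings on a 1-10 scale (values invented for illustration).
# Keys are ordered pairs: (rater, person being rated).
ratings = {
    ("Ida", "Joey"): 9,   # Ida likes Joey
    ("Joey", "Ida"): 4,   # Joey just tolerates Ida
}


def is_asymmetric(proximity, a, b):
    """Return True if the (a, b) and (b, a) entries differ."""
    return proximity[(a, b)] != proximity[(b, a)]


def fit_score(area_x, area_y):
    """Hypothetical asymmetric measure: how well X "fits into" Y.

    Assumes X and Y have the same shape, so area alone decides the fit.
    The score is 1.0 when X is no larger than Y; otherwise it is the
    fraction of X's area that Y could hold.
    """
    return min(1.0, area_y / area_x)


small_triangle_area = 1.0   # stands in for the triangle of Figure 8.1b
large_triangle_area = 4.0   # stands in for the triangle of Figure 8.1a

print(is_asymmetric(ratings, "Ida", "Joey"))
print(fit_score(small_triangle_area, large_triangle_area))  # small into large
print(fit_score(large_triangle_area, small_triangle_area))  # large into small
```

Under these assumptions the small-into-large comparison scores 1.0 and the reverse scores 0.25, so swapping the order of the arguments changes the value, which is the defining property of an asymmetric measure.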