ABSTRACT

In the previous chapter, we considered what might be called attributed data: sets of records, each of which specifies the values of the attributes of one object. When such data is clustered, the similarity between two records is based on a combination of the similarities of their corresponding attributes. The simplest such measure, of course, is Euclidean distance, where the squares of the differences between corresponding attribute values are summed and a square root is then taken. Strictly speaking, this gives a dissimilarity: the smaller the distance, the more similar the two records.
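
As a minimal sketch of the Euclidean distance just described, assuming each record is a plain list of numeric attribute values of equal length (the function name `euclidean_distance` and the sample records are hypothetical, chosen only for illustration):

```python
import math


def euclidean_distance(a, b):
    """Distance between two records given as equal-length numeric sequences."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    # Sum the squared differences of corresponding attributes, then take the root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical records with three attributes each.
record_a = [1.0, 2.0, 3.0]
record_b = [4.0, 6.0, 3.0]

print(euclidean_distance(record_a, record_b))  # 5.0
```

The squared differences here are 9, 16 and 0, so the distance is the square root of 25. In practice attributes are often rescaled first, since an attribute with a large numeric range would otherwise dominate the sum.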