Significance in Scale-Space for Clustering
An intuitive, visual approach to finding clusters in low dimensions is through the study of smoothed histograms, e.g., kernel density estimates. Scale-space provides a useful framework for understanding data smoothing. See Lindeberg (1994) and ter Haar Romeny (2001) for excellent overviews of the extensive scale-space literature.
The scale-space approach has allowed practical resolution of several long-standing problems in the statistical smoothing literature; see Chaudhuri and Marron (1999, 2000) for detailed discussion. For example, the classical problem of choosing the level of smoothing (the bandwidth) can be viewed in an entirely new way using scale-space ideas. In particular, instead of choosing one level of smoothing, one should consider the full range of smooths (the whole scale-space). This corresponds to viewing the data at a number of different levels of resolution, each of which may contain useful information.
For clustering purposes, this simultaneous viewing of several different levels of smoothing incurs an added cost of interpretation. In particular, it becomes more challenging to decide which of the many clusters found at different levels represent important underlying structure, and which are insignificant sampling artifacts. An overview of some solutions to this problem is given in Section 2.2. These solutions involve scale-space views of the data (i.e., a family of smooths), enhanced by visual devices that reflect the statistical significance of the clusters that are present.
In keeping with the visual nature of these new methods, only the one- and two-dimensional cases are presented. Higher-dimensional clustering is certainly of keen interest, but visual implementation in higher dimensions represents a very significant hurdle. For now, dimension reduction methods need to be applied first, before these approaches can be used in higher dimensions.
In Section 2.3 we propose a new enhancement of the two-dimensional version, based on the natural idea of contour lines.
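As a rough illustration of the scale-space view described above, the following minimal sketch computes a family of Gaussian kernel density estimates over a range of bandwidths and counts the modes (candidate clusters) at each level. It is only a sketch under stated assumptions: the function names `gaussian_kde_family` and `count_modes`, the simulated two-cluster data, and the bandwidth grid are all hypothetical choices made for illustration, and no significance assessment is performed here.

```python
import numpy as np


def gaussian_kde_family(data, grid, bandwidths):
    """Evaluate a Gaussian kernel density estimate on `grid` for each bandwidth.

    Returns an array of shape (len(bandwidths), len(grid)); each row is one
    smooth, so the rows together form a discretized scale-space view.
    """
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    smooths = []
    for h in bandwidths:
        # Standardized distances between every grid point and every data point.
        z = (grid[:, None] - data[None, :]) / h
        dens = np.exp(-0.5 * z ** 2).sum(axis=1)
        dens /= len(data) * h * np.sqrt(2.0 * np.pi)
        smooths.append(dens)
    return np.array(smooths)


def count_modes(density):
    """Count interior local maxima of a density evaluated on a grid."""
    d = np.diff(density)
    return int(np.sum((d[:-1] > 0) & (d[1:] <= 0)))


# Simulated data with two well-separated clusters (illustrative assumption).
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2.0, 0.5, 100), rng.normal(2.0, 0.5, 100)])

grid = np.linspace(-5.0, 5.0, 401)
bandwidths = np.logspace(-1.5, 0.5, 9)  # from strong undersmoothing to oversmoothing

family = gaussian_kde_family(data, grid, bandwidths)
modes = [count_modes(f) for f in family]

for h, m in zip(bandwidths, modes):
    print(f"bandwidth {h:.3f}: {m} mode(s)")
```

Small bandwidths typically produce many spurious modes and large bandwidths merge everything into one, which is the interpretation problem that the significance devices of Section 2.2 are meant to address.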