ABSTRACT

This chapter reviews the major concepts and algorithms for density-based clustering. It discusses the key aspects of density estimation, connectivity definition, and data structures for efficient implementation. The chapter analyses advanced density-based approaches to subspace clustering and the clustering of network data, as well as the clustering of data streams and uncertain data. Density-based clustering can be considered a non-parametric method, as it makes no assumptions about the number of clusters or their distribution. Density-based clusters are connected, dense areas in the data space, separated from each other by sparser areas. DBSCAN estimates the density by counting the number of points in a fixed-radius neighborhood and considers two points connected if they lie within each other’s neighborhood. DENCLUE generalizes the notion of density-based clusters, building on the concept of influence functions that mathematically model the influence of a data point in its neighborhood.
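
The two density notions mentioned above can be illustrated with a minimal sketch. It is not the reference implementation of either algorithm: the function names, the sample points, the radius `eps`, and the bandwidth `sigma` are all assumptions made for illustration, and the Gaussian kernel is only one possible choice of influence function.

```python
import math

def neighborhood_count(points, i, eps):
    """DBSCAN-style density: number of points within distance eps of points[i].

    The point itself is included in the count (an assumption of this sketch).
    """
    return sum(1 for q in points if math.dist(points[i], q) <= eps)

def directly_connected(points, i, j, eps):
    """Two points lie within each other's eps-neighborhood.

    With a symmetric distance this reduces to a single distance check.
    """
    return math.dist(points[i], points[j]) <= eps

def influence_density(points, x, sigma):
    """DENCLUE-style density at location x: sum of the influences of all points.

    A Gaussian influence function is assumed here as one common choice.
    """
    return sum(math.exp(-math.dist(x, p) ** 2 / (2 * sigma ** 2)) for p in points)

# Hypothetical toy data: three nearby points and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
eps, sigma = 0.5, 0.5

counts = [neighborhood_count(points, i, eps) for i in range(len(points))]
dense_value = influence_density(points, points[0], sigma)
sparse_value = influence_density(points, points[3], sigma)
```

In this toy data the three nearby points each have three points in their neighborhood, while the isolated point has only itself; the influence-based density gives the same ordering, but as a smooth value rather than a count.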