A Survey of Uncertain Data Clustering Algorithms

doi:10.1201/9781315373515-18

ABSTRACT

This chapter surveys clustering algorithms for uncertain data. It discusses mixture model clustering of uncertain data and describes density-based clustering algorithms for uncertain data. Mixture model clustering is a popular method for clustering deterministic data; it models the clusters in the underlying data in terms of a number of probabilistic parameters. Density-based methods are very popular in the deterministic clustering literature because of their ability to find clusters of arbitrary shape in the underlying data. Other work designs methods for speeding up the distance computations in the clustering process. The microclustering model was first proposed for large data sets and subsequently adapted for deterministic data streams. In the standard clustering problem, the main effect of uncertainty is its impact on the distance computations. The interplay between the clustering of the values and the level of uncertainty may affect which subspaces are optimal for the clustering process.