ABSTRACT

This chapter presents classical grid-based clustering algorithms as well as those algorithms that directly address the challenges of nonuniformity, locality, and high-dimensionality. Grid-based clustering algorithms are efficient in mining large multidimensional data sets. These algorithms partition the data space into a finite number of cells to form a grid structure and then form clusters from the cells in the grid structure. Grid-based clustering algorithms typically involve the following five steps: creating the grid structure, calculating the cell density for each cell, sorting of the cells according to their densities, identifying cluster centers and traversal of neighbor cells. The effectiveness of a grid-based clustering algorithm is seriously limited by the size of the predefined grids, the borders of the cells, and the density threshold of the significant cells in the face of local variations in shape and density in a data space. The greatest challenge for using these algorithms is determining the best strategy of constructing the grid structure.