ABSTRACT

This chapter provides a detailed overview of four additional unsupervised clustering techniques. It illustrates how the agglomerative clustering algorithm partitions instances into disjoint clusters, discusses Cobweb's incremental hierarchical clustering technique, shows how the EM algorithm uses classical statistics to perform unsupervised clustering, and presents an evolutionary approach to unsupervised clustering. RapidMiner and Weka both support several unsupervised clustering algorithms. Weka's unsupervised algorithms can be found in the Weka Clusterers folder. Most of RapidMiner's clustering algorithms are housed under models in the segmentation subfolder. The EM algorithm is a statistical technique that makes use of the finite Gaussian mixtures model. Its solid statistical foundation and its similarity to the K-means algorithm make it one of the most referenced clustering algorithms. The chapter provides two tutorials that give a better understanding of how the EM algorithm clusters numeric data. The tutorial titled RapidMiner's EM Operator applies RapidMiner's Expectation Maximization operator to the gamma-ray burst data set.
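As a rough illustration of how EM relates to a Gaussian mixtures model, the following minimal sketch fits two one-dimensional Gaussians to synthetic data. It is not the RapidMiner or Weka implementation and does not use the gamma-ray burst data set. The function names, the initialization rule, the fixed iteration count, and the sample values are all assumptions made for this example.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def em_two_gaussians(data, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with EM (illustrative sketch)."""
    # Assumed initialization: put the two means at the extremes of the data.
    means = [min(data), max(data)]
    spread = (max(data) - min(data)) / 4.0 or 1.0
    stds = [spread, spread]
    weights = [0.5, 0.5]

    for _ in range(iterations):
        # E-step: probability that each point belongs to each cluster.
        resp = []
        for x in data:
            p = [weights[k] * gaussian_pdf(x, means[k], stds[k]) for k in range(2)]
            total = sum(p) or 1e-300  # guard against division by zero
            resp.append([pk / total for pk in p])

        # M-step: re-estimate each cluster's weight, mean, and standard deviation.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # skip an empty cluster rather than divide by zero
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            stds[k] = max(math.sqrt(var), 1e-6)

    return weights, means, stds


# Synthetic sample: two well-separated groups (values chosen for illustration).
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(200)]
sample += [random.gauss(8.0, 1.0) for _ in range(200)]

weights, means, stds = em_two_gaussians(sample)
print("weights:", weights)
print("means:", means)
print("standard deviations:", stds)
```

The similarity to K-means shows in the loop structure: both alternate between assigning instances to clusters and recomputing cluster parameters. The difference is that EM assigns each instance a probability of membership in every cluster, where K-means makes a hard assignment to the nearest center.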