ABSTRACT

The analysis of data collected on rock discontinuities often requires that the data be separated into joint sets or groups. A statistical tool that automatically identifies groups or clusters of observations in a data set is therefore needed. Several algorithms have been proposed for this purpose, most of them based on the fuzzy K-means method. This is a widely used partitioning technique in which optimization involves minimizing the sum of squared distances between objects and the cluster means (centroids). However, empirical studies have revealed drawbacks of this popular clustering method, such as its tendency to produce spherical clusters and its dependence on the scale of the variables. More importantly, the method tends to generate clusters of equal size, often referred to as the equal-size problem, which makes it inadequate for distinguishing clusters of different sizes. One explanation for this tendency is that a partition obtained by minimizing the trace of the dispersion matrix is equivalent to the maximum likelihood partition when the data are assumed to come from a mixture of multivariate normal distributions with equal covariance matrices. This paper introduces the application of a clustering method whose optimization criterion is based on the estimated covariance matrix, which overcomes the equal-size problem associated with the K-means method while maintaining computational simplicity. The advantages of the proposed method are demonstrated using artificial data of low dimensionality.
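
As a minimal illustration of the K-means criterion mentioned above, the following sketch checks on a toy example that the sum of squared distances to the cluster centroids equals the trace of the within-cluster dispersion matrix. The function names, the toy points, and the hard (non-fuzzy) labels are assumptions made for illustration only; the sketch does not implement the covariance-based criterion proposed in this paper.

```python
from collections import defaultdict


def centroids(points, labels):
    """Return the mean of each cluster, keyed by cluster label."""
    groups = defaultdict(list)
    for p, k in zip(points, labels):
        groups[k].append(p)
    return {k: tuple(sum(c) / len(g) for c in zip(*g)) for k, g in groups.items()}


def sum_squared_distances(points, labels):
    """K-means objective: squared distance of every point to its centroid."""
    m = centroids(points, labels)
    return sum(
        sum((x - c) ** 2 for x, c in zip(p, m[k]))
        for p, k in zip(points, labels)
    )


def within_scatter(points, labels):
    """Within-cluster dispersion matrix W, returned as a nested list."""
    m = centroids(points, labels)
    d = len(points[0])
    W = [[0.0] * d for _ in range(d)]
    for p, k in zip(points, labels):
        diff = [x - c for x, c in zip(p, m[k])]
        for i in range(d):
            for j in range(d):
                W[i][j] += diff[i] * diff[j]
    return W


def trace(W):
    """Sum of the diagonal entries of a square matrix."""
    return sum(W[i][i] for i in range(len(W)))


# Hypothetical toy data: two clusters of two points each (assumed labels).
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 1.0), (10.0, 3.0)]
labels = [0, 0, 1, 1]

ssd = sum_squared_distances(points, labels)
tr_w = trace(within_scatter(points, labels))
print(ssd, tr_w)  # both values are the same quantity
```

Because the criterion depends on W only through its trace, it ignores the off-diagonal (covariance) structure of the clusters, which is consistent with the preference for spherical clusters noted above.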