ABSTRACT

As the discovery of information from text corpora becomes increasingly important, there is a need to develop clustering algorithms designed for this task. Density-based methods are among the most successful approaches to clustering. However, because of the very high dimensionality of text data, these algorithms are not directly applicable. In this paper, we demonstrate the need to suitably exploit existing feature reduction techniques in order to maximize the clustering performance of density-based methods.