ABSTRACT

Text clustering is a fully automatic process for grouping a collection of texts, and a typical unsupervised machine learning task. Its goal is to find a set of classes such that the similarity between classes is as low as possible and the similarity within each class is as high as possible. Because it is unsupervised, clustering needs no training process and no manual annotation of document categories in advance, so it is flexible and well suited to automated processing. It has become an important means of effectively organizing and navigating text information, and it is attracting the attention of a growing number of researchers.