ABSTRACT

Clustering algorithms tackle the problem of grouping similar data points in a dataset. The Self-Organizing Map (SOM), or Kohonen map, is a clustering algorithm commonly used for grouping and data visualization, but its relatively slow training time makes it unable to cluster extremely large or extremely complex datasets completely or effectively. The researchers propose using Bradley-Fayyad-Reina (BFR) clustering as a preprocessing step that first summarizes the dataset so that an SOM can then be trained, and provide an empirical analysis and testing framework to evaluate the proposed algorithm.