ABSTRACT

This chapter introduces an approach for clustering and visualizing high-dimensional data, especially textual data. The self-organizing map (SOM) is a neural network paradigm for exploratory data analysis. The SOM is equipped with an unsupervised and competitive learning algorithm. It consists of an array of neurons placed in a regular, usually two-dimensional grid. The neurons are connected to adjacent neurons by a neighborhood relation, which dictates the structure. The neurons most often are connected to each other via a rectangular or hexagonal grid structure. The SOM is an unsupervised neural network, which means the training of a SOM is completely data driven. The form of the neighborhood function determines the rate of change around the winner neuron. The simplest neighborhood function is the bubble function, which is constant over the defined neighborhood of the winner unit and zero elsewhere. The learning rate and the neighborhood function together determine which neurons and how much these neuron are allowed to learn.