ABSTRACT

Data segmentation is a fundamental requirement for analytics on customer backgrounds, sales transactions, customer surveys, and similar data. An appropriate segmentation of customer data can provide insights that enhance a company's performance by identifying the most valuable customers or the customers who are likely to leave the company. Cluster analysis (Duda et al., 2001; Jain et al., 1999; Jain and Dubes, 1988) attempts to segment a dataset of items or instances into clusters, where instances correspond to points in an n-dimensional Euclidean space. A cluster is a set of points that are similar to each other but dissimilar to the other points in the dataset. Most traditional clustering techniques, as with feed-forward and supervised neural networks, rely on data points carefully crafted as fixed-length vectors, that is, ordered n-tuples. Each component of a vector represents some feature of an object from the underlying problem domain.