Partitional Algorithms | 14 | Clustering in Bioinformatics and Drug Di

ABSTRACT

Partitional clustering algorithms are widely used in both bioinformatics and drug discovery. Variations of K-means continue to appear in the literature, largely in an effort to handle ever larger data sets efficiently. Jarvis-Patrick was a popular early method in cheminformatics for clustering large-scale data sets of binary fingerprints, but it has largely been supplanted by K-means-like and sampling algorithms. Spectral clustering, popular in the imaging community, has had some adherents for medium-sized data sets in cheminformatics. Self-organizing maps of various forms, which originated in the engineering literature, have become popular in bioinformatics. They have an analog computing flavor, with numerous parameters that vary throughout the clustering process, but it can be shown that their overall behavior is quite similar to that of K-means. Their popularity is due in part to the fact that they provide a convenient way to visualize the results as part of the process and output.
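The resemblance between self-organizing maps and K-means can be illustrated with a minimal sketch. It is not taken from the chapter and rests on several assumptions: a one-dimensional grid of map units, a Gaussian neighborhood function, and an online (one sample at a time) form of K-means. The names som_step, online_kmeans_step, lr, and sigma are hypothetical and chosen only for illustration. Under these assumptions, as the neighborhood width sigma shrinks toward zero, only the winning unit moves, and the map update coincides with the online K-means update.

```python
import numpy as np

def som_step(weights, x, lr, sigma):
    """One online SOM update on a 1-D grid of units (illustrative sketch).

    weights: (k, d) float array of unit prototypes; x: (d,) sample.
    lr is the learning rate; sigma is the neighborhood width on the grid.
    """
    # Best-matching unit: the prototype closest to x, as in a K-means assignment.
    winner = np.argmin(np.sum((weights - x) ** 2, axis=1))
    # Distance of every unit from the winner, measured along the grid.
    grid_dist = np.abs(np.arange(len(weights)) - winner)
    # Gaussian neighborhood: 1 at the winner, decaying with grid distance.
    h = np.exp(-(grid_dist ** 2) / (2.0 * sigma ** 2))
    # Move each unit toward x in proportion to its neighborhood weight.
    return weights + lr * h[:, None] * (x - weights)

def online_kmeans_step(centers, x, lr):
    """One online K-means update: only the nearest center moves toward x."""
    winner = np.argmin(np.sum((centers - x) ** 2, axis=1))
    updated = centers.copy()
    updated[winner] += lr * (x - centers[winner])
    return updated

# Toy data, assumed for illustration only.
prototypes = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]])
sample = np.array([0.9, 1.2])

wide = som_step(prototypes, sample, lr=0.5, sigma=2.0)     # neighbors move too
narrow = som_step(prototypes, sample, lr=0.5, sigma=1e-3)  # only the winner moves
kmeans = online_kmeans_step(prototypes, sample, lr=0.5)
```

With the wide neighborhood every unit is pulled toward the sample, which is what gives the map its ordered, visualizable layout. With the narrow neighborhood the result matches the K-means step. In practice both lr and sigma are typically decreased over the course of training, which corresponds to the parameters that vary throughout the clustering process.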