ABSTRACT

Abstract
30.1 Introduction
30.2 Multidimensional Data
    30.2.1 Performance Measures in Information Retrieval
    30.2.2 Dendrogram to Define the Clusters of Performance Measures
    30.2.3 Principal Component Analysis to Validate the Clusters
    30.2.4 3D-Map
30.3 Graphs and Collaborative Networks
    30.3.1 Basis of Collaboration Networks
    30.3.2 Geographic and Thematic Collaboration Networks
    30.3.3 Large Collaborative Networks
    30.3.4 Temporal Collaborative Networks
30.4 Curve Clustering
    30.4.1 Time Series Microarray Experiment
    30.4.2 Principal Component Analysis to Characterize Clusters
    30.4.3 Visualizing Curves
    30.4.4 Heatmap to Combine Two Clusterings
30.5 Conclusion
References

Cluster analysis is a major data mining method for presenting overviews of large data sets. Clustering methods allow dimension reduction by finding groups of similar objects or elements. Visual cluster analysis has been defined as a specialization of cluster analysis and is considered a way to handle complex data through interactive exploration of clustering results. In this chapter, we consider three case studies that illustrate cluster analysis and interactive visual analysis. The first case study comes from the field of information retrieval and illustrates the case of multidimensional data, in which the objects to analyze are described by various features or variables. Evaluation in information retrieval relies on many performance measures; cluster analysis is used to reduce them to a small number of measures that can be used to compare search engines. The second case study considers networks, in which the data to analyze is represented as adjacency matrices. The data we used is obtained from publications; cluster analysis is used to analyze collaborative networks. The third case study is related to curve clustering and applies when temporal data is involved; here the application is time-series gene expression. We conclude the chapter by presenting other types of data for which visual clustering can be used for analysis, along with some tools that implement visual analysis functionalities not covered in the case studies.