ABSTRACT

Canonical Correlation Analysis (CCA) is a classical technique for finding correlations between two sets of multi-dimensional variables [132]. CCA takes two views of the same set of objects and projects them onto lower-dimensional spaces in which they are maximally correlated. It has become a powerful tool for analyzing so-called paired data (X, Y), where X and Y are two different representations of the same set of objects [210]. Such a scenario arises in many real-world applications. For example, a parallel corpus [139] contains texts in two languages with similar content, which can be treated as paired data sets. In content-based image retrieval [105], an image and its associated text can be considered two different views of the same semantic content.
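
As a rough illustration of the projection idea described above, the following Python sketch computes the first pair of canonical directions for paired data (X, Y) with NumPy. The function name `cca_first_pair`, the small ridge term `reg`, and the whitening-plus-SVD route are assumptions made for this example; they are one standard way to compute CCA, not necessarily the formulation used in the cited works.

```python
import numpy as np


def cca_first_pair(X, Y, reg=1e-8):
    """Return (w_x, w_y, rho): first canonical directions and their correlation.

    X has shape (n, p) and Y has shape (n, q); row i of each matrix
    describes the same object in two different views.
    `reg` is a small ridge term (an assumption of this sketch) that keeps
    the covariance matrices positive definite.
    """
    n = X.shape[0]
    # Center each view so that covariances reduce to matrix products.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Within-view and cross-view sample covariance matrices.
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)

    # Cholesky factors: Cxx = Lx Lx^T and Cyy = Ly Ly^T.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)

    # Whitened cross-covariance T = Lx^{-1} Cxy Ly^{-T}.
    T = np.linalg.solve(Lx, Cxy)
    T = np.linalg.solve(Ly, T.T).T

    # Singular values of T are the canonical correlations.
    U, s, Vt = np.linalg.svd(T)

    # Map the leading singular vectors back to the original coordinates.
    w_x = np.linalg.solve(Lx.T, U[:, 0])
    w_y = np.linalg.solve(Ly.T, Vt[0, :])
    return w_x, w_y, s[0]
```

Projecting the centered views onto `w_x` and `w_y` gives two one-dimensional variables whose sample correlation equals the returned `rho`, up to the effect of `reg`. Further canonical pairs correspond to the remaining singular vectors.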