Distance metric learning is a fundamental problem in data mining and knowledge discovery, and it is of key importance for many real-world applications. For example, information retrieval uses the learned distance metric to measure the relevance of candidate data to a query; clinical decision support uses it to measure pairwise patient similarity [19, 23, 24]; and pattern recognition uses it to match the most similar patterns. In a broader sense, distance metric learning lies at the heart of many data classification problems. As long as a proper distance metric is learned, we can always adopt a k-Nearest Neighbor (kNN) classifier to classify the data. In recent years, many studies [12, 27, 29] have demonstrated, either theoretically or empirically, that learning a good distance metric can greatly improve the performance of data classification tasks.