ABSTRACT

5.1 Introduction

It has been said that astronomers have been doing data mining for centuries: “the data are mine, and you cannot have them!” Seriously, astronomers are trained as data miners, because we are trained to (1) characterize the known (i.e., unsupervised learning, clustering), (2) assign the new (i.e., supervised learning, classification), and (3) discover the unknown (i.e., semisupervised learning, outlier detection) [1,2]. These skills are more critical than ever since astronomy is now a data-intensive science, and it will become even more data intensive in the coming decade [3-5].