ABSTRACT

Classification algorithms address the problem of assigning a label to an unlabeled data point given a set of labeled data. The labels may be supplied with the original data or produced by a clustering method. The following algorithms are described: k-nearest neighbor, naive Bayes classifier, decision trees, and logistic regression. Other classification techniques, such as artificial neural networks, support vector machines, and hidden Markov models, are described in subsequent chapters. Classification algorithms can also be used for regression, which this chapter addresses as well. A set of exercises and a classification project are given at the end of the chapter.