ABSTRACT

This chapter extends discussion on predictive modeling to include many other models that are not based on regression. The models authors explore in this chapter are not necessarily continuous, nor are they necessarily expressed as parametric functions. A decision tree is a tree-like flowchart that assigns class labels to individual observations. Each branch of the tree separates the records in the data set into increasingly “pure” subsets, in the sense that they are more likely to share the same class label. A natural extension of a decision tree is a random forest. A random forest is collection of decision trees that are aggregated by majority rule.