ABSTRACT

Methodological development added several features to the basic tree structure to make it more flexible, powerful, and efficient. In some data sets, the classes are naturally separated by hyperplanes that are not perpendicular to the coordinate axes. Such problems are difficult for the unmodified tree-structured procedure and produce large trees, because the algorithm tries to approximate the hyperplanes with multidimensional rectangular regions. To cope with these situations, the basic structure has been enhanced to allow a search for the best split over linear combinations of variables. Another class of problems that is difficult for the basic tree procedure is characterized by high dimensionality and a Boolean structure; for these, the search is extended to Boolean combinations of splits. The Boolean combination method preserves the invariance of the tree structure under monotone univariate transformations and maintains simplicity of interpretation. The term feature is used in the pattern recognition literature to denote a variable that is manufactured as a real-valued function of the originally measured variables.
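
The difficulty with oblique class boundaries can be illustrated with a small sketch. It is not the search procedure described here: the data are synthetic, the boundary x1 + x2 = 1 is assumed, and the linear combination x1 + x2 is supplied by hand rather than found by a search over coefficients. The helper names (gini, split_impurity, best_split_impurity) and the choice of the Gini index as the impurity measure are illustrative assumptions. The sketch compares the best single split on either coordinate with the best single split on the linear combination.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 class labels (0.0 for an empty node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def split_impurity(values, labels, threshold):
    """Weighted Gini impurity of the two nodes produced by values <= threshold."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    return (len(left) * gini(left) + len(right) * gini(right)) / len(labels)


def best_split_impurity(values, labels):
    """Lowest weighted impurity over thresholds placed at the observed values."""
    return min(split_impurity(values, labels, t) for t in set(values))


# Synthetic data (an assumption for illustration): two classes separated
# by the oblique line x1 + x2 = 1 in the unit square.
random.seed(0)
points = [(random.random(), random.random()) for _ in range(400)]
labels = [1 if x1 + x2 > 1.0 else 0 for x1, x2 in points]

# Best single split perpendicular to a coordinate axis.
axis_best = min(
    best_split_impurity([p[i] for p in points], labels) for i in range(2)
)

# Best single split on the hand-chosen linear combination x1 + x2.
combo_values = [x1 + x2 for x1, x2 in points]
combo_best = best_split_impurity(combo_values, labels)

print("best axis-parallel split impurity:", round(axis_best, 3))
print("best linear-combination split impurity:", combo_best)
```

In this sketch, one split on the combined variable separates the classes completely, while no single axis-parallel split does, so a tree restricted to coordinate splits would need many further splits to approximate the same boundary.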