ABSTRACT

Tree-structured regression is simpler than tree-structured classification: the same impurity criterion used to grow the tree is also used to prune it, and there are no priors to deal with. The use of a stepwise optimal tree structure in least squares regression dates back to the Automatic Interaction Detection (AID) program proposed by Morgan and Sonquist. Regression analysis is the generic term for the construction of a predictor d(x) from a learning sample L. A predictor can be constructed for two purposes: (1) to predict the response variable corresponding to future measurement vectors as accurately as possible; (2) to understand the structural relationships between the response and the measured variables. In linear regression, the common practice for variable selection is to use either a stepwise selection or a best subsets algorithm. Because variable selection invalidates the inferential model, stepwise and best subsets regression methods have to be viewed as heuristic data analysis tools.
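
To illustrate what a single stepwise optimal least squares split might look like, the following is a minimal sketch and not the AID program or any published implementation. It assumes one numeric predictor, takes the impurity of a node to be the sum of squared deviations of the responses from the node mean, and places the threshold midway between adjacent distinct values. The names `sse` and `best_split` are hypothetical and chosen only for this example.

```python
from typing import List, Optional, Tuple


def sse(values: List[float]) -> float:
    """Least squares impurity: sum of squared deviations from the mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs: List[float], ys: List[float]) -> Tuple[Optional[float], float]:
    """Find the split `x <= threshold` that minimizes the summed impurity
    of the two child nodes. Returns (threshold, summed impurity); the
    threshold is None when no split improves on the unsplit node."""
    pairs = sorted(zip(xs, ys))
    best_threshold: Optional[float] = None
    best_total = sse(ys)  # impurity of the unsplit node
    for i in range(1, len(pairs)):
        # A split can only fall between two distinct predictor values.
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best_total:
            # Assumption: midpoint between neighbours as the threshold.
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_total = total
    return best_threshold, best_total


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
    ys = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
    print(best_split(xs, ys))
```

Applying such a split recursively to each child node would grow a tree one locally optimal step at a time; the sketch makes no attempt at pruning or at handling several predictors.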