ABSTRACT

Leo Breiman, known for his work on classification and regression trees and random forests, formalized stacking in his 1996 paper on Stacked Regressions. Although the idea originated earlier under the name “Stacked Generalizations” (Wolpert 1992), the modern form of stacking that uses internal k-fold cross-validation was Breiman’s contribution. An alternative ensemble approach focuses on stacking multiple models generated from the same base learner; stacking a grid search provides the greatest benefit when the leading models from the base learner have high variance in their hyperparameter settings. The chapter also discusses performing an automated search across multiple base learners and then stacking the resulting models. The biggest gains are usually produced when stacking base learners that have highly variable and uncorrelated predicted values: the more similar the predicted values are across base learners, the less advantage there is to combining them.
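
The core idea can be illustrated with a minimal sketch of Breiman-style stacking: out-of-fold predictions from several base learners become the level-one features for a meta-learner. The sketch below is an assumption-laden illustration, not the chapter's own code; it assumes Python with scikit-learn and uses arbitrary example learners and hyperparameters.

    # Minimal sketch of stacking with internal k-fold cross-validation.
    # Assumes scikit-learn is available; learners and settings are illustrative.
    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_predict
    from sklearn.neighbors import KNeighborsRegressor
    from sklearn.tree import DecisionTreeRegressor

    X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=42)

    # Base learners: stacking helps most when their predictions are weakly correlated.
    base_learners = [
        DecisionTreeRegressor(max_depth=5, random_state=42),
        KNeighborsRegressor(n_neighbors=10),
        Ridge(alpha=1.0),
    ]

    # Level-one features: out-of-fold predictions from each base learner,
    # so the meta-learner never sees predictions made on a learner's own training fold.
    level_one = np.column_stack(
        [cross_val_predict(model, X, y, cv=5) for model in base_learners]
    )

    # Meta-learner combines the base predictions.
    meta_learner = Ridge(alpha=1.0).fit(level_one, y)

    # To score new data: refit each base learner on all of X, generate their
    # predictions, and feed those predictions to the meta-learner.
    fitted_bases = [model.fit(X, y) for model in base_learners]
    new_X = X[:5]
    level_one_new = np.column_stack([m.predict(new_X) for m in fitted_bases])
    print(meta_learner.predict(level_one_new))

The same pattern extends to the grid-search case the abstract mentions: the candidate models from a hyperparameter search (or from several automated searches over different base learners) supply the columns of the level-one matrix instead of hand-picked learners.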