ABSTRACT

The basic idea is that combining many learners, each of which gets slightly different results on a dataset (some learning certain aspects of the data well and others learning different ones), produces results that are significantly better than any single learner on its own. Boosting has an extreme form when applied to trees, in which each weak learner is a tree with a single split; it goes by the descriptive name of stumping. Few methods in machine learning have grown in popularity as quickly as random forests. Bagging, in contrast, puts most of its effort into ensuring that the different classifiers see different data, since each classifier is trained on its own bootstrap sample of the dataset. There are some obvious similarities to boosting, but it is the differences that are most telling.
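
To make the bagging idea concrete, the following is a minimal sketch, assuming scikit-learn and NumPy are available: each learner is a decision stump (a one-level tree) trained on a bootstrap sample, and the ensemble predicts by majority vote. The function names `bagged_stumps` and `predict_vote` are illustrative, not from the text.

    # Minimal bagging sketch: stumps trained on bootstrap samples, combined by voting.
    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier

    def bagged_stumps(X, y, n_learners=25, seed=0):
        """Train n_learners stumps, each on a bootstrap sample of (X, y)."""
        rng = np.random.default_rng(seed)
        learners = []
        for _ in range(n_learners):
            idx = rng.integers(0, len(X), size=len(X))    # sample with replacement
            stump = DecisionTreeClassifier(max_depth=1)   # a "stump": one split only
            stump.fit(X[idx], y[idx])
            learners.append(stump)
        return learners

    def predict_vote(learners, X):
        """Combine the learners by majority vote over their predictions."""
        votes = np.stack([l.predict(X) for l in learners])   # shape (n_learners, n_samples)
        return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)

    X, y = load_iris(return_X_y=True)
    ensemble = bagged_stumps(X, y)
    print((predict_vote(ensemble, X) == y).mean())   # ensemble accuracy on the training data

Because every stump sees a different resampled version of the data, the individual learners disagree in their errors, and the vote averages those errors away; boosting, by comparison, reweights the data rather than resampling it.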