After generating a set of base learners, rather than trying to find the best single learner, ensemble methods resort to combination to achieve strong generalization ability, and here the combination method plays a crucial role. Dietterich [2000a] attributed the benefit of combination to the following three fundamental reasons:

• Statistical issue: It is often the case that the hypothesis space is too large to explore with limited training data, and there may be several different hypotheses achieving the same accuracy on the training data. If the learning algorithm chooses only one of these hypotheses, there is a risk that the chosen hypothesis will predict future data poorly. As shown in Figure 4.1(a), by combining the hypotheses, the risk of choosing a wrong hypothesis can be reduced.