ABSTRACT

Several supervised machine learning algorithms are based on a single predictive model, for example, ordinary linear regression, penalized regression models, single decision trees, and support vector machines. Gradient boosting machines (GBMs), by contrast, combine many individual models into a single ensemble and are among the most powerful ensemble algorithms, often first-in-class in predictive accuracy. GBMs are extremely popular, have proven successful across many domains, and are one of the leading methods for winning Kaggle competitions. This chapter covers the fundamentals needed to understand GBMs and to work with several of their popular implementations. A simple GBM model contains two categories of hyperparameters: boosting hyperparameters, which govern the sequential ensemble, and tree-specific hyperparameters, which govern each individual tree; dropout can also be used to address overfitting in GBMs. When constructing a GBM, the first few trees in the ensemble typically dominate model performance, while trees added later typically improve predictions for only a small subset of the feature space. Measuring GBM feature importance and effects follows the same construct as for random forests.
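
To make the two hyperparameter categories and the early-trees-dominate behavior concrete, the sketch below fits a GBM with scikit-learn's GradientBoostingRegressor in Python. This is one illustrative implementation, not necessarily the one used in this chapter, and the synthetic data and parameter values are assumptions chosen only for demonstration.

    # Minimal sketch, assuming scikit-learn; data and settings are illustrative.
    from sklearn.datasets import make_regression
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.metrics import r2_score
    from sklearn.model_selection import train_test_split

    # Synthetic regression data, used purely for illustration.
    X, y = make_regression(n_samples=1000, n_features=10, noise=10.0, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    gbm = GradientBoostingRegressor(
        # Boosting hyperparameters: control the sequential ensemble as a whole.
        n_estimators=500,     # number of trees added sequentially
        learning_rate=0.05,   # shrinkage applied to each tree's contribution
        # Tree-specific hyperparameters: control each individual weak learner.
        max_depth=3,          # depth of each regression tree
        min_samples_leaf=10,  # minimum observations in a terminal node
        random_state=42,
    )
    gbm.fit(X_train, y_train)

    # Impurity-based feature importance, computed with the same construct
    # scikit-learn uses for random forests.
    print(gbm.feature_importances_)

    # Trees added early dominate performance: the partial-ensemble score after
    # a few dozen trees is typically already close to the full-ensemble score.
    for i, y_pred in enumerate(gbm.staged_predict(X_test), start=1):
        if i in (10, 50, 500):
            print(f"trees={i}: test R^2={r2_score(y_test, y_pred):.3f}")

Dropout-style boosting is not part of this particular estimator; it is available in other implementations (for example, the DART booster options in XGBoost and LightGBM), as discussed later in the chapter.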