ABSTRACT

For some models (e.g., neural networks and boosted trees), the number of degrees of freedom is so large that finding the right parameters can be challenging. This chapter addresses these issues, but the reader must be aware that there is no shortcut to building good models. Crafting an effective model is time-consuming and often the result of many iterations. The parameter values that are fixed before training begins are called hyperparameters. To choose good hyperparameters, it is imperative to define metrics that evaluate the performance of ML models. The performance metrics for categorical outcomes differ substantially from those for numerical outputs. A large proportion of these metrics are dedicated to binary classification, though some of them can easily be generalized to multiclass models. The chapter presents the concepts pertaining to these metrics in increasing order of complexity, starting with the two dichotomies: true versus false, and positive versus negative.
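To make these two dichotomies concrete, the minimal Python sketch below (not taken from the chapter; the function name and the toy labels are hypothetical) counts true/false positives and negatives for a binary classifier and derives accuracy, precision, and recall from those counts.

    import numpy as np

    def binary_classification_metrics(y_true, y_pred):
        """Confusion-matrix counts and common binary metrics.

        y_true, y_pred: arrays of 0/1 labels (1 = positive class).
        """
        y_true = np.asarray(y_true)
        y_pred = np.asarray(y_pred)
        tp = np.sum((y_pred == 1) & (y_true == 1))  # true positives
        fp = np.sum((y_pred == 1) & (y_true == 0))  # false positives
        tn = np.sum((y_pred == 0) & (y_true == 0))  # true negatives
        fn = np.sum((y_pred == 0) & (y_true == 1))  # false negatives
        accuracy = (tp + tn) / len(y_true)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        return {"TP": tp, "FP": fp, "TN": tn, "FN": fn,
                "accuracy": accuracy, "precision": precision,
                "recall": recall}

    # Toy example with made-up labels, for illustration only
    print(binary_classification_metrics([1, 0, 1, 1, 0, 0],
                                        [1, 0, 0, 1, 1, 0]))

On these toy labels the sketch yields TP = 2, FP = 1, TN = 2, FN = 1, so accuracy is 4/6 while precision and recall are both 2/3, illustrating how all of the binary metrics discussed in the chapter trace back to the same four counts.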