ABSTRACT

With a small sample size, either the model complexity should be reduced, or expert knowledge and external data should replace some of the data-dependent decisions, or both. We move from the setting where the computer makes no decisions at all to the extreme where the computer makes almost all of them. A typical machine learning approach uses the bootstrap or cross-validation to tune an algorithm that learns from the data. This chapter suggests cross-validation as a mandatory tool also for building prediction models. Nested layers of cross-validation are needed when internal cross-validation is used to assess the prediction performance of a modeling algorithm that itself uses cross-validation to build the model. For each combination of the number of knots and the penalty parameter, we calculate the cross-validated Brier score and choose the combination, and the corresponding penalized regression model, with the best prediction performance.
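
A minimal sketch of such a nested cross-validation loop, assuming a penalized logistic regression on spline-expanded covariates tuned by the cross-validated Brier score; the data, knot grid, and penalty grid below are illustrative and not taken from the chapter:

    # Nested cross-validation sketch (illustrative, not the chapter's code):
    # the inner loop tunes the number of knots and the penalty strength by
    # the cross-validated Brier score, the outer loop assesses the tuned
    # modeling algorithm as a whole.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV, cross_val_score
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import SplineTransformer

    # Simulated binary outcome data stand in for a real study sample.
    X, y = make_classification(n_samples=300, n_features=5, random_state=1)

    # Penalized regression on spline-expanded covariates; the number of
    # knots and the penalty strength are the tuning parameters.
    model = Pipeline([
        ("spline", SplineTransformer()),
        ("ridge_logit", LogisticRegression(penalty="l2", max_iter=5000)),
    ])
    grid = {
        "spline__n_knots": [3, 5, 8],
        "ridge_logit__C": [0.01, 0.1, 1.0, 10.0],  # C is the inverse penalty
    }

    # Inner cross-validation: pick the knot/penalty combination with the
    # best (least negative) cross-validated Brier score.
    inner_cv = GridSearchCV(model, grid, scoring="neg_brier_score", cv=5)

    # Outer cross-validation: assess the prediction performance of the
    # tuning algorithm itself, giving the nested layers described above.
    outer_scores = cross_val_score(inner_cv, X, y, scoring="neg_brier_score", cv=5)
    print("Cross-validated Brier score:", -outer_scores.mean())

The outer loop scores the tuning procedure rather than any single fitted model, which is what makes the two layers of cross-validation nested rather than redundant.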