ABSTRACT

Model selection is not always necessary. Sometimes there is one established model derived from physical theory or empirical experience. In other cases, the researchers commit to a particular model as part of a designed experiment. Also, if the number of predictors is small relative to the number of observations, model selection may offer little benefit, and it can be justifiable to skip it. When model selection is used, it should not be separated from the rest of the analysis, because other parts of the data analysis, such as the treatment of outliers or transformations of the variables, can change which model is selected. A model that is too small will tend to give biased predictions because it is insufficiently flexible to represent the relationship between the predictors and the response. Harrell has extensive advice on model validation using sample splitting and cross-validation. Steyerberg also provides advice on model selection and validation in clinical applications.
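
As a rough illustration of the point about models that are too small, the following sketch uses simulated data and k-fold cross-validation to compare a straight-line fit with a quadratic fit when the true relationship is quadratic. Everything in it is an assumption made for illustration: the helper name kfold_mse, the simulated coefficients, the noise level, and the choice of five folds do not come from Harrell or Steyerberg.

```python
import numpy as np


def kfold_mse(x, y, degree, k=5, seed=0):
    """Estimate prediction MSE of a polynomial fit by k-fold cross-validation.

    Hypothetical helper written for this illustration only.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))      # shuffle before splitting into folds
    folds = np.array_split(idx, k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # Fit on the training folds only, then predict the held-out fold.
        coefs = np.polyfit(x[train], y[train], degree)
        pred = np.polyval(coefs, x[test])
        errors.append(np.mean((y[test] - pred) ** 2))
    return float(np.mean(errors))


# Simulated data (assumed): the true mean response is quadratic in x.
rng = np.random.default_rng(42)
x = rng.uniform(-2, 2, size=200)
y = 1.0 + 0.5 * x + 2.0 * x**2 + rng.normal(scale=0.5, size=200)

mse_linear = kfold_mse(x, y, degree=1)     # too small: omits the x^2 term
mse_quadratic = kfold_mse(x, y, degree=2)  # matches the simulated truth

print(f"CV MSE, linear:    {mse_linear:.3f}")
print(f"CV MSE, quadratic: {mse_quadratic:.3f}")
```

Under these assumptions the straight-line model should show a clearly larger cross-validated error, because it cannot represent the curvature and its predictions are systematically off. The held-out folds play the role of the validation sample in sample splitting.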