Greedy Search Methods
This chapter discusses greedy search methods such as simple univariate filters and recursive feature elimination. The most basic approach to feature selection is to screen the predictors to see whether any have a relationship with the outcome before including them in a model.

The Parkinson's disease data have several characteristics that can make modeling challenging: the predictors exhibit a high degree of multicollinearity, the sample size is small, and the outcome is imbalanced. For these data, backwards selection was conducted with random forests, with each ensemble containing 10,000 trees; the model's importance scores were used to rank the predictors. Recursive feature elimination can be an effective and relatively efficient technique for reducing model complexity by removing irrelevant predictors.

Stepwise selection was originally developed as a feature selection technique for linear regression models. It is less greedy than other search methods because it reconsiders adding terms back into the model after they have been removed.
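A minimal sketch of the screening idea, using synthetic data (not the chapter's data set) and a simple absolute-correlation filter as the univariate score; the function name and the 0.3 threshold are illustrative choices, not a prescribed rule:

```python
import numpy as np

rng = np.random.default_rng(42)
n, p = 100, 10
X = rng.normal(size=(n, p))
# Only the first two predictors truly relate to the outcome.
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.5, size=n)

def screen_predictors(X, y, threshold=0.3):
    """Keep predictors whose absolute Pearson correlation with y
    exceeds the threshold (a simple univariate filter)."""
    cors = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    return np.flatnonzero(np.abs(cors) >= threshold)

kept = screen_predictors(X, y)
print(kept)  # the informative predictors (columns 0 and 1) survive the screen
```

In practice the score would be chosen to match the predictor and outcome types (e.g., a t-statistic or odds ratio for a categorical outcome), and the screening should be done inside resampling to avoid selection bias.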
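The backwards-selection procedure described above can be sketched with scikit-learn's `RFE`, pairing a random forest's importance scores with stepwise elimination of the lowest-ranked predictors; the synthetic data and the smaller ensemble (200 trees rather than 10,000) are assumptions to keep the sketch fast, not the chapter's actual analysis:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

# Synthetic classification data standing in for the real predictors.
X, y = make_classification(n_samples=200, n_features=20, n_informative=5,
                           random_state=0)

# Rank predictors by the forest's impurity-based importances and
# recursively drop the weakest two per iteration until five remain.
rf = RandomForestClassifier(n_estimators=200, random_state=0)
selector = RFE(rf, n_features_to_select=5, step=2).fit(X, y)
print(selector.support_.sum())  # → 5 predictors retained
```

The retained columns are marked in `selector.support_`; as with filtering, the elimination loop should be wrapped in resampling so that the subset size is chosen on held-out performance rather than the training fit.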