ABSTRACT

We consider nonparametric variable selection using Bayesian decision tree ensembles, primarily from a fully-Bayesian perspective. On the surface, variable selection for Bayesian decision trees seems very simple: a variable is in the model if it is used in a splitting rule of some tree, and otherwise it is not. We show that the situation is more subtle than this, with poorly chosen priors producing overly dense models. We review both methods that determine variable importance without using the fitted model and fully-Bayesian approaches that derive variable inclusion probabilities from the model itself. Fully-Bayesian approaches rely on sparsity-inducing priors to perform well; such priors include the Dirichlet prior, Spike-and-Forest priors, and a new class of Gibbs priors that we introduce. We illustrate these approaches on simulated and benchmark datasets. We also discuss extensions of these approaches to settings in which the predictors possess additional structure, such as graphical or grouping structure.