ABSTRACT

Feature selection is one of the fundamental problems in machine learning. The role of feature selection is critical, especially in applications involving many irrelevant features. Yet, compared with classifier design (e.g., SVM and AdaBoost), feature selection has received far less rigorous theoretical treatment. Most feature selection algorithms rely on heuristic search and thus cannot provide any guarantee of optimality. This is largely due to the difficulty of defining an objective function that can be easily optimized by well-established optimization techniques. This is particularly true for wrapper methods, where a nonlinear classifier is used to evaluate the goodness of selected feature subsets. The problem can to some extent be alleviated by a feature-weighting strategy, which assigns to each feature a real-valued number, instead of a binary one, to indicate its relevance to a learning problem. Among existing feature-weighting algorithms, the Relief algorithm [10] is considered one of the most successful due to its simplicity and effectiveness [5]. However, it is unclear to date what objective function Relief optimizes. In this chapter, we first prove that Relief implements an online algorithm that solves a convex optimization problem with a margin-based objective function, where the margin is defined based on a 1-NN classifier. Therefore, compared with filter methods, Relief usually performs better because it receives performance feedback from a nonlinear classifier when searching for useful features; and compared with conventional wrapper methods, by optimizing a convex problem, Relief avoids any exhaustive or heuristic combinatorial search and thus can be implemented very efficiently. This new interpretation explains the simplicity and effectiveness of Relief.
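For readers unfamiliar with Relief, the following minimal sketch illustrates the weight-update rule of the original two-class Relief algorithm, whose margin-based interpretation this chapter analyzes. Details such as the function name, the L1 distance, and the number of iterations are illustrative assumptions, not specifications from this chapter.

```python
import numpy as np

def relief_weights(X, y, n_iter=None, rng=None):
    """Basic Relief feature weighting for a two-class problem (sketch).

    For each sampled instance, the nearest same-class neighbor (hit) and
    nearest other-class neighbor (miss) are found with a 1-NN rule, and each
    feature's weight is updated by how much it separates the instance from
    its miss versus its hit -- a per-feature margin contribution.
    """
    rng = np.random.default_rng(rng)
    n_samples, n_features = X.shape
    n_iter = n_samples if n_iter is None else n_iter
    w = np.zeros(n_features)

    for _ in range(n_iter):
        i = rng.integers(n_samples)
        x, label = X[i], y[i]
        # L1 distances from x to all samples; exclude x itself.
        dist = np.abs(X - x).sum(axis=1)
        dist[i] = np.inf
        same = (y == label)
        hit = np.argmin(np.where(same, dist, np.inf))    # nearest hit
        miss = np.argmin(np.where(~same, dist, np.inf))  # nearest miss
        # Reward features that separate x from its miss, penalize those
        # that separate x from its hit.
        w += np.abs(x - X[miss]) - np.abs(x - X[hit])

    return w / n_iter
```

Features with large positive weights are those for which the nearest miss lies farther away than the nearest hit on average, which is the margin quantity the chapter shows Relief implicitly maximizes.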