ABSTRACT

Recent years have seen explosive growth of databases in all areas of human endeavor, driven by progress in digital data acquisition and storage technology. In this work, we address the feature selection problem within a classification framework, where the aim is to build a classifier that accurately predicts the classes of new unlabeled instances. In theory, having more features and instances should provide more discriminating power. In practice, however, it can cause several problems: increased computational complexity and cost; too many redundant or irrelevant features; and degraded estimation of the classification error.