Rare Class Learning

doi:10.1201/b17320-22

ABSTRACT

In most real data domains, some examples of normal or abnormal data may be available. Such examples are referred to as training data, and they can be used to create a classification model that distinguishes between normal and anomalous instances. Because anomalies are rare, such examples are often limited, which makes it hard to build robust models that generalize well. The problem of classification has been widely studied in its own right, and numerous algorithms are available in the literature [15] for creating supervised models from training data. In many cases, examples of several different kinds of abnormal instances may be available, in which case the classification model may also be able to distinguish among them. For example, in an intrusion detection scenario, different kinds of intrusions are possible, and it may be desirable to distinguish among them.
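
As a rough illustration of the supervised setting described above, the following sketch trains a classifier on a small labeled set in which the normal class dominates and two kinds of anomalies each have only a couple of examples. A nearest-centroid classifier is used here purely for brevity; it is not taken from the chapter, and the feature values and the label names ("normal", "dos", "probe") are hypothetical.

```python
import math
from collections import defaultdict


def fit_centroids(points, labels):
    """Return one mean vector (centroid) per class label."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(points, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        for i, value in enumerate(x):
            sums[y][i] += value
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest (Euclidean distance)."""
    return min(centroids, key=lambda y: math.dist(centroids[y], x))


# Hypothetical training data: six normal points, two examples of each anomaly type.
train_points = [
    (1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2), (0.9, 0.8), (1.0, 1.3),
    (9.0, 1.0), (9.2, 1.1),   # hypothetical "dos" intrusions
    (1.0, 9.0), (0.9, 9.3),   # hypothetical "probe" intrusions
]
train_labels = ["normal"] * 6 + ["dos"] * 2 + ["probe"] * 2

centroids = fit_centroids(train_points, train_labels)

# Classify unseen points; the model separates the anomaly types from each
# other as well as from the normal class.
predictions = [predict(centroids, p) for p in [(1.1, 1.0), (8.5, 1.5), (1.0, 8.0)]]
print(predictions)
```

With only two examples per anomaly type, each anomaly centroid rests on very little evidence, which is one concrete way in which the scarcity of rare-class examples limits the robustness of the resulting model.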