A Multi-Partition, Multi-Chunk Ensemble for Classifying Concept-Drifting Data Streams

Chapter

ABSTRACT

This chapter describes the development of the ensemble and the experiments used to evaluate it. It presents a multi-partition, multi-chunk (MPC) ensemble classifier, a data mining technique for classifying concept-drifting data streams. Existing ensemble techniques for classifying concept-drifting data streams follow a single-partition, single-chunk (SPC) approach, in which a single data chunk is used to train one classifier. By introducing the MPC ensemble technique, the authors significantly reduce classification error compared to the SPC ensemble approaches. The chapter presents results for all five techniques: MPC, Wang, BestL, All, and Last. As soon as a new data chunk appears, the authors test each of these ensembles or classifiers on the new data and update its accuracy, false positive rate, and false negative rate. Although the total training time of MPC is higher than that of Wang, the total testing times of the two techniques are almost the same. Wang tends to keep much older classifiers in the ensemble than MPC does.
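
The evaluation loop described in the abstract (test every ensemble or classifier on each newly arrived chunk, then update its accuracy, false positive rate, and false negative rate) might be sketched as follows. This is a minimal illustration, not the authors' code. It assumes binary labels (0 = negative, 1 = positive), takes the false positive rate as FP / (FP + TN) and the false negative rate as FN / (FN + TP), and leaves out training and ensemble updating, which the abstract does not specify. The names RunningRates and evaluate_on_stream are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunningRates:
    """Cumulative confusion counts for one model (hypothetical helper; labels assumed 0/1)."""
    tp: int = 0
    tn: int = 0
    fp: int = 0
    fn: int = 0

    def update(self, y_true, y_pred):
        """Fold the predictions for one newly arrived chunk into the counts."""
        for truth, pred in zip(y_true, y_pred):
            if truth == 1 and pred == 1:
                self.tp += 1
            elif truth == 0 and pred == 0:
                self.tn += 1
            elif truth == 0 and pred == 1:
                self.fp += 1
            else:
                # Remaining case under the 0/1 assumption: truth == 1, pred == 0.
                self.fn += 1

    @property
    def accuracy(self):
        total = self.tp + self.tn + self.fp + self.fn
        return (self.tp + self.tn) / total if total else 0.0

    @property
    def false_positive_rate(self):
        # Assumed definition: FP / (FP + TN).
        negatives = self.fp + self.tn
        return self.fp / negatives if negatives else 0.0

    @property
    def false_negative_rate(self):
        # Assumed definition: FN / (FN + TP).
        positives = self.fn + self.tp
        return self.fn / positives if positives else 0.0


def evaluate_on_stream(models, chunks):
    """Test every model on each chunk as it arrives and update its running rates.

    models: dict mapping a name to a callable predict(features) -> list of 0/1 labels
    chunks: iterable of (features, labels) pairs, in arrival order
    """
    rates = {name: RunningRates() for name in models}
    for features, labels in chunks:
        for name, predict in models.items():
            rates[name].update(labels, predict(features))
    return rates
```

In use, each of the five techniques would be wrapped as a predict callable and passed in under its name, so that all of them are scored on exactly the same sequence of chunks.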
