ABSTRACT

This chapter presents a case study in which five data mining models are designed to predict the survival rate of oral cancer patients who visit the Ear, Nose and Throat Outpatient Department (ENT-OPD). The five predictive models, each of which treats the task as a classification problem, are single tree, TreeBoost, decision tree forest, multilayer perceptron, and support vector machine. The study identifies the most effective model for predicting the survivability of oral cancer by examining 1025 patients who visited a tertiary care center from January 2004 to December 2009. None of the models misclassifies a row in any category; every case is classified correctly. The performance of the models is estimated on the basis of the validation method, misclassification table, confusion matrix, sensitivity and specificity report, lift and gain chart, and area under the ROC curve. All the models identify the diagnostic biopsy as the most important attribute. All the models also give similar results in terms of misclassification statistics, confusion matrix, sensitivity, and specificity, as well as lift and gain. The single tree model takes the least time for analysis. The results for probability calibration and threshold analysis are better for the support vector machine model, which makes it the most favorable model for predicting the survival rate of oral cancer patients.
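As a minimal sketch of how the sensitivity, specificity, and misclassification figures named above follow from a two-class confusion matrix, the Python fragment below may help. The function names and the counts are hypothetical and chosen only for illustration; they are not taken from the study's data or from the software used in the chapter.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from two-class confusion matrix counts.

    tp: positive cases predicted positive
    fn: positive cases predicted negative
    tn: negative cases predicted negative
    fp: negative cases predicted positive
    """
    # Sensitivity is the share of actual positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity is the share of actual negatives that were rejected.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def misclassification_rate(tp, fn, tn, fp):
    """Return the fraction of all cases assigned to the wrong class."""
    total = tp + fn + tn + fp
    return (fn + fp) / total if total else float("nan")


# Hypothetical counts (assumption, not study data): 50 positives, 50 negatives.
sens, spec = sensitivity_specificity(tp=40, fn=10, tn=45, fp=5)
rate = misclassification_rate(tp=40, fn=10, tn=45, fp=5)

# A classifier with no misclassified rows, again with made-up counts.
perfect = sensitivity_specificity(tp=50, fn=0, tn=50, fp=0)
perfect_rate = misclassification_rate(tp=50, fn=0, tn=50, fp=0)
```

When no row is misclassified, as the abstract reports for all five models, both off-diagonal counts are zero, so sensitivity and specificity both equal 1 and the misclassification rate is 0. In that situation these statistics alone cannot separate the models, which is consistent with the abstract turning to analysis time, probability calibration, and threshold analysis for the comparison.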