ABSTRACT

In this part, we discussed our proposed data mining technique for detecting email worms. Different features, such as the total number of words in the message body or subject, the presence or absence of binary attachments, the types of attachments, and others, are extracted from the emails. Then the number of features is reduced using a Two-phase Selection (TPS) technique, which is a novel combination of a decision tree and a greedy selection algorithm. We have used different classification techniques, such as Support Vector Machine (SVM), Naïve Bayes, and their combination. Finally, the trained classifiers are tested on a dataset containing both known and unknown types of worms. Compared to the baseline approaches, our proposed TPS selection along with SVM classification achieves the best accuracy in detecting both known and unknown types of worms.