ABSTRACT

This chapter discusses unsupervised data mining techniques, including association rule mining, clustering, and anomaly detection, together with their security applications and adaptations. Data mining is the science of extracting patterns from data, with the goal of finding patterns that are interesting, nontrivial, and useful. Its use can lead to unexpected privacy violations, because it can uncover relationships between data items that defeat simple anonymization techniques. Data exploration is the process of getting to know the data and assessing its quality. Data normalization is a very important step in the preprocessing phase. An important issue with schemes that encode categorical variables as dummy (indicator) variables is known as the dummy variable trap. A serious limitation of association rule mining is that it is restricted to binary data. Clustering is a popular, exploratory, unsupervised technique whose goal is to group similar objects into the same cluster and keep dissimilar objects in different clusters.