ABSTRACT

In this chapter, we present a novel method to reduce the incidence of false negatives in the clustering of malware detected during drive-by download attacks. Our method uses Capture-HPC, a high-interaction client honeypot, to acquire behavioral system and network data, and then applies cluster analysis to that data. It addresses three issues in clustering: (1) finding the number of clusters in a data set, (2) finding good initial centroids, and (3) determining the relevance of each feature within each cluster.