ABSTRACT

Anomaly detection is the identification of suspicious patterns in network traffic that differ from the majority of the data. Detection can help prevent major attacks on an organization. However, it is difficult to build a model that performs well against zero-day attacks, because the anomaly-containing data needed to train it is hard to obtain: both attacks and anomalies vary with the situation. The deep learning-based approach taps the network to capture freely available data, producing a .pcap file from which features are extracted with a Python script. The Gaussian probability of each packet is then computed, because a packet containing an anomaly will deviate from the normal distribution. The relevant features and their mapped values are extracted from the raw packets and stored in a CSV file, which serves as the input tensor for the deep learning model. The model outputs the probabilities that a packet is normal or anomalous. The accuracy score is calculated using a statistical joint-probability approach that assumes all features are independent, and an anomaly is verified by comparing the probabilities to a specific threshold.
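
The joint-probability step described above can be illustrated with a minimal sketch. It assumes a Gaussian is fitted to each feature using normal traffic only, the per-feature densities are multiplied under the independence assumption, and a packet is flagged when the product falls below a threshold. The two features (packet length and TTL), the synthetic data, the function names, and the threshold value `epsilon` are all hypothetical choices made for illustration; they are not taken from the work summarized here.

```python
import numpy as np

def fit_gaussian(X):
    """Estimate per-feature mean and variance from traffic assumed to be normal."""
    mu = X.mean(axis=0)
    var = X.var(axis=0) + 1e-9  # small floor avoids division by zero
    return mu, var

def joint_probability(X, mu, var):
    """Product of per-feature Gaussian densities (features assumed independent)."""
    dens = np.exp(-((X - mu) ** 2) / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)
    return dens.prod(axis=1)

def is_anomalous(X, mu, var, epsilon):
    """Flag rows whose joint density falls below the threshold epsilon."""
    return joint_probability(X, mu, var) < epsilon

# Hypothetical training data: columns are packet length (bytes) and TTL.
rng = np.random.default_rng(0)
normal_packets = rng.normal(loc=[500.0, 64.0], scale=[50.0, 4.0], size=(1000, 2))
mu, var = fit_gaussian(normal_packets)

# One packet close to the training data and one far from it.
test_packets = np.array([[510.0, 63.0], [1500.0, 5.0]])
epsilon = 1e-6  # assumed threshold; in practice it would be tuned on validation data
flags = is_anomalous(test_packets, mu, var, epsilon)
```

With many features, the product of densities can underflow, so summing log-densities and comparing against a log-threshold is a common alternative to the direct product used in this sketch.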