ABSTRACT

Outlier detection, removal, and analysis are important steps in the data cleansing process of any application. As applications generate large volumes of log files and records, it becomes important to remove outliers that are irrelevant to the dataset. An outlier is a data point or object that lies beyond the range of probable data. Because it differs from the normal data, it needs to be detected and removed: including outliers in a calculation can change the entire result. Outlier detection is also called anomaly detection. In this paper, we study the identification of outliers in big data-based applications in the healthcare sector. The paper presents an introduction to outliers, related work, the challenges of outlier detection, and a conclusion with future work.