ABSTRACT

Customer churn prediction and retention are major concerns for many service-based organizations, and skewed class distributions make these problems significantly harder. Class imbalance exists when one class has far fewer samples than the other classes. In machine learning, data-level approaches are a well-known way to handle the class imbalance problem. This paper provides a comprehensive study of the machine learning algorithms applied to customer churn prediction. Its key focus is the current literature on the data imbalance problem in classification, the handling of skewed customer churn data, and the data-level techniques used to overcome imbalanced data distributions. Finally, we identify a number of research implications and future directions for conventional, big data, and deep machine learning.