ABSTRACT

Supervised learning refers to training a machine on labeled data so that it predicts the correct outcome for unseen data. Classification and regression are the two major approaches to supervised learning. The coefficient of variation (CV) can be used to develop models for both of these approaches. This chapter demonstrates the characteristics of CV that make it useful for building supervised learning models. It introduces CV Gain, along the lines of Information Gain, for attribute selection and demonstrates decision tree and regression tree induction. The merits of CV Gain computation make it highly suitable for Big Data environments as well as for applications where the decision can be fuzzy. The CV Gain of an attribute is computed with the help of its Conditional CV. Conditional CV explains the dispersion of the decision attribute with respect to an independent attribute. CV is a measure of the dispersion or variation of an attribute.
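
As a rough illustration of how these three quantities relate, the following Python sketch may help. The abstract does not state the formulas, so everything here is an assumption: CV is taken as the population standard deviation divided by the mean, expressed as a percentage; Conditional CV is taken as the size-weighted average of the decision attribute's CV within each value of a categorical independent attribute; and CV Gain is taken as the difference between the two, by analogy with Information Gain. The function names cv, conditional_cv, and cv_gain are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def cv(values):
    """Coefficient of variation in percent (assumed: population std / |mean| * 100)."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * pstdev(values) / abs(m)


def conditional_cv(attribute, decision):
    """Assumed definition: size-weighted average of the decision attribute's CV
    within each distinct value of the independent attribute."""
    groups = defaultdict(list)
    for a, d in zip(attribute, decision):
        groups[a].append(d)
    n = len(decision)
    return sum(len(g) / n * cv(g) for g in groups.values())


def cv_gain(attribute, decision):
    """Assumed definition, by analogy with Information Gain:
    overall CV of the decision attribute minus its Conditional CV."""
    return cv(decision) - conditional_cv(attribute, decision)


if __name__ == "__main__":
    # Toy data: 'group' separates low from high decision values, 'noise' does not.
    decision = [10, 12, 11, 30, 32, 31]
    group = ["a", "a", "a", "b", "b", "b"]
    noise = ["x", "y", "x", "y", "x", "y"]
    print("CV Gain (group):", cv_gain(group, decision))
    print("CV Gain (noise):", cv_gain(noise, decision))
```

Under these assumptions, an attribute that splits the data into groups with little internal variation in the decision attribute receives a higher CV Gain, which is the property an attribute-selection step in tree induction would rely on.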