ABSTRACT

Information gain is based on entropy, a well-known information-theoretic measure that characterizes the purity of an arbitrary collection of items and is regarded as a measure of a system’s unpredictability. Information gain measures the expected reduction in entropy caused by partitioning the examples according to a feature A.
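
For illustration, the textbook definition of this quantity can be written as Gain(S, A) = H(S) - sum over v in Values(A) of (|S_v| / |S|) H(S_v), where H is the Shannon entropy of the class labels. The symbols S (the collection of examples), Values(A) (the distinct values of feature A), and S_v (the subset of S for which A takes value v) are notation assumed here and do not appear in the text above. The following is a minimal Python sketch under those assumptions, for a discrete feature only; the function names `entropy` and `information_gain` are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    # H(S) = -sum_c p_c * log2(p_c), summed over the classes present in S
    return -sum((c / total) * log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Expected entropy reduction from partitioning labels by a discrete feature.

    Assumes feature_values and labels are equal-length sequences, where
    feature_values[i] is the value of feature A for example i.
    """
    total = len(labels)
    # Group the labels by feature value: one subset S_v per value v.
    partitions = {}
    for value, label in zip(feature_values, labels):
        partitions.setdefault(value, []).append(label)
    # Weighted average entropy of the subsets after the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder
```

As a usage sketch on made-up toy data, a feature that separates the two classes perfectly, such as `information_gain(["a", "a", "b", "b"], ["good", "good", "bad", "bad"])`, recovers the full entropy of the labels, while a feature that is independent of the labels yields a gain of zero.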

1 INTRODUCTION

Credit risk management plays a key role in the financial and banking industry. An accurate estimate of risk, used in corporate or global financial risk models, can translate into more efficient use of resources. One important ingredient in achieving this goal is finding useful predictors of individual risk in the credit portfolios of institutions. However, assessing credit risk is very challenging because many factors may contribute to the risk, and the relationships among them are difficult to capture and measure. Recent years have witnessed a growing trend of applying machine learning methods to credit risk analysis. These methods learn automatically from historical data and yield highly accurate predictions in many practical tasks. One powerful machine learning method is the Support Vector Machine (SVM) [1].