ABSTRACT

This chapter describes another popular classification method, decision trees, which is based on graph theory. Decision trees are among the algorithms whose decisions can be explained in a relatively comprehensible way, that is, it is possible to determine why a particular class was assigned to the element under investigation. In addition, converting the tree to a set of rules yields another form of comprehensible knowledge extracted from the data. The basic problem is how to construct, from empirically obtained data, a decision tree that classifies as well as possible and has an optimal structure. The method known as c5 was selected here. It is one of the most widely used methods, has been applied to a very wide range of problems, and has produced countless successful solutions. The chapter explains how the c5 algorithm uses entropy minimization to build a tree by splitting the original heterogeneous sets, which contain a mixture of elements from different classes, into more homogeneous sets. Again, the principle of operation is demonstrated on a simple example, followed by a demonstration of the implementation in R on real data. The resulting tree (or its initial part from the root) is shown, together with its conversion to rules.
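
To make the idea of entropy minimization concrete, the following is a minimal sketch, written in Python purely for illustration. It is not the c5 algorithm or its R implementation. It computes only the Shannon entropy of a set of class labels and the size-weighted entropy of a candidate split. The function names `entropy` and `split_entropy` and the toy labels are assumptions introduced for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a collection of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    # Sum of p * log2(1/p) over the classes present in the set.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def split_entropy(subsets):
    """Average entropy of the subsets, weighted by subset size."""
    total = sum(len(s) for s in subsets)
    return sum(len(s) / total * entropy(s) for s in subsets)


# Hypothetical toy data: a heterogeneous set mixing two classes.
mixed = ["pos", "pos", "pos", "neg", "neg", "neg"]

# Two hypothetical candidate splits of the same set.
split_a = [["pos", "pos", "pos"], ["neg", "neg", "neg"]]  # homogeneous subsets
split_b = [["pos", "neg", "pos"], ["neg", "pos", "neg"]]  # subsets still mixed

print(entropy(mixed))          # entropy before splitting
print(split_entropy(split_a))  # weighted entropy after split A
print(split_entropy(split_b))  # weighted entropy after split B
```

In this sketch, split A yields a lower weighted entropy than split B because its subsets are homogeneous, so a tree builder that minimizes entropy would prefer it. A full tree-induction method also needs further steps (for example, choosing among attributes, handling numeric values, and pruning), which are omitted here.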