ABSTRACT

CONTENTS
28.1 Introduction
28.2 First Definitions
28.3 Entropic Version
28.4 Experimental Results
28.5 Conclusion
References

Filtering relevant and intelligible information with quantitative quality measures remains one of the most sensitive phases of a rule mining process. To take dataset sizes explicitly into account, unlike the classical confidence, and to highlight the “natural” asymmetry of the notion of implication, Gras defined the implication intensity, which measures the statistical surprisingness of the discovered rules. However, like many measures in the literature, it does not take into account the contrapositive ¬b ⇒ ¬a, which could nonetheless reinforce the assertion of an implication between a and b. Here, we introduce a new measure based on the Shannon entropy that quantifies the imbalances between examples and counter-examples for both the rule and its contrapositive. We compare this measure numerically with the confidence and Loevinger’s index on synthetic databases and on real data of various types, from human resource management and from lift breakdowns. We also compare the statistical distributions of the number of rules retained by each index and underline the interest, in a decision process, of rules that behave differently depending on the chosen measure.
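To make the quantities named in the abstract concrete, the following minimal sketch computes, for a rule a ⇒ b, the confidence, Loevinger's index, the confidence of the contrapositive ¬b ⇒ ¬a, and the Shannon entropy of the split between examples and counter-examples on each side. It is an illustration under stated assumptions, not the entropic measure introduced in this chapter: the function names, the count-based signature (`n`, `n_a`, `n_b`, `n_ab`), and the choice to report the two entropies separately are all hypothetical.

```python
from math import log2


def binary_entropy(p):
    """Shannon entropy (in bits) of a two-outcome split with proportion p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def rule_summary(n, n_a, n_b, n_ab):
    """Illustrative quality indicators for the rule a => b.

    n    : number of transactions
    n_a  : transactions containing a
    n_b  : transactions containing b
    n_ab : transactions containing both a and b
    (Hypothetical helper; the names are assumptions, not the chapter's notation.)
    """
    n_not_b = n - n_b
    if n_a <= 0 or n_not_b <= 0:
        raise ValueError("the rule needs n_a > 0 and at least one transaction without b")

    # Counter-examples (a and not b) are shared by the rule and its contrapositive.
    counter = n_a - n_ab

    return {
        "confidence": n_ab / n_a,
        # Loevinger's index: 1 - P(a and not b) / (P(a) * P(not b)).
        "loevinger": 1 - (n * counter) / (n_a * n_not_b),
        # Confidence of the contrapositive not b => not a.
        "contrapositive_confidence": 1 - counter / n_not_b,
        # Imbalance between examples and counter-examples, rule side.
        "rule_entropy": binary_entropy(counter / n_a),
        # Same imbalance, contrapositive side.
        "contrapositive_entropy": binary_entropy(counter / n_not_b),
    }


if __name__ == "__main__":
    for name, value in rule_summary(n=100, n_a=40, n_b=50, n_ab=36).items():
        print(f"{name}: {value:.3f}")
```

A low entropy on both sides indicates that counter-examples are rare relative to examples for the rule and for its contrapositive alike, which is the kind of imbalance the abstract says the new measure is designed to quantify.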