ABSTRACT

FIGURE 4.12: The precision/recall (PR) curves (left) and cumulative recall curves (right) for the two versions of Naive Bayes and the ORh method.

Given these results, we might ask whether building a single model for all products, rather than one model per product as we did with the unsupervised methods, is causing difficulties for Naive Bayes. As an exercise, you can try the per-product approach with Naive Bayes. This requires adapting the code that splits the transactions by product, used for the unsupervised models, to the Naive Bayes model. An additional difficulty, should you decide to carry out this exercise, is that you will have very few labeled reports per product. Even without the restriction to labeled data, we have observed that several products have too few transactions. Restricting the data to labeled transactions will only make this problem worse.
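As a starting point for this exercise, the following is a minimal Python sketch of the per-product idea, not the implementation used in this chapter. It assumes each transaction is a dictionary with hypothetical keys `prod`, `uprice`, and `insp`, where `insp` takes the illustrative values `"ok"`, `"fraud"`, or `"unkn"` (unlabeled). It fits a one-feature Gaussian Naive Bayes model on unit price for every product that has enough labeled reports of both classes, and falls back to a single global model otherwise. The threshold `min_labeled` and the toy data are arbitrary choices made only for illustration.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows):
    """Fit a one-feature Gaussian Naive Bayes model.

    rows: list of (uprice, label) pairs.
    Returns {label: (prior, mean, variance)}.
    """
    by_class = defaultdict(list)
    for x, y in rows:
        by_class[y].append(x)
    n = len(rows)
    model = {}
    for y, xs in by_class.items():
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs)
        # Floor the variance so a class with identical prices stays usable.
        model[y] = (len(xs) / n, mean, max(var, 1e-6))
    return model


def fraud_probability(model, x):
    """Posterior probability of the (assumed) 'fraud' label for unit price x."""
    log_joint = {}
    for y, (prior, mean, var) in model.items():
        log_joint[y] = (math.log(prior)
                        - 0.5 * math.log(2 * math.pi * var)
                        - (x - mean) ** 2 / (2 * var))
    top = max(log_joint.values())  # subtract the max for numerical stability
    weights = {y: math.exp(v - top) for y, v in log_joint.items()}
    return weights.get("fraud", 0.0) / sum(weights.values())


def fit_per_product(transactions, min_labeled=4):
    """Fit one model per product, plus a global model used as a fallback."""
    labeled = [t for t in transactions if t["insp"] != "unkn"]
    global_model = fit_gaussian_nb([(t["uprice"], t["insp"]) for t in labeled])
    groups = defaultdict(list)
    for t in labeled:
        groups[t["prod"]].append((t["uprice"], t["insp"]))
    models = {}
    for prod, rows in groups.items():
        classes = {y for _, y in rows}
        # Keep a per-product model only when both classes are represented
        # and there are enough labeled reports to estimate them.
        if len(rows) >= min_labeled and len(classes) == 2:
            models[prod] = fit_gaussian_nb(rows)
    return models, global_model


def score(models, global_model, prod, uprice):
    """Score a transaction with its product's model, or the global fallback."""
    return fraud_probability(models.get(prod, global_model), uprice)


# Toy data: product p1 has six labeled reports, p2 has only one.
toy = [
    {"prod": "p1", "uprice": 10.0, "insp": "ok"},
    {"prod": "p1", "uprice": 11.0, "insp": "ok"},
    {"prod": "p1", "uprice": 9.0, "insp": "ok"},
    {"prod": "p1", "uprice": 10.0, "insp": "ok"},
    {"prod": "p1", "uprice": 30.0, "insp": "fraud"},
    {"prod": "p1", "uprice": 32.0, "insp": "fraud"},
    {"prod": "p2", "uprice": 100.0, "insp": "ok"},
    {"prod": "p2", "uprice": 95.0, "insp": "unkn"},
]
models, global_model = fit_per_product(toy)
```

In this toy setting only `p1` gets its own model; `p2` is scored with the global model because it has a single labeled report. On real data, counting how many products end up on the fallback path is a quick way to see how severe the shortage of labeled transactions per product is.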