ABSTRACT

In this book, we assume that each instance in the training set is represented as a pair of vectors, one for the input features and the other for the output labels. Multi-label learning concerns the prediction of the labels of unseen instances by building a classifier based on the training data. Formally, let $X$ and $Y$ denote the input instance space and the output label space, respectively. In multi-label learning, the label space is defined as $Y = \{0, 1\}^k$, where $k$ is the number of labels. That is, the $j$th component of an instance's label vector is 1 if the instance is relevant to the $j$th label, and 0 otherwise. As in traditional classification, given a training data set, the goal of multi-label learning is to learn a classifier $f : X \to Y$ that predicts the labels of each instance $x \in X$. Specifically, the output of the classifier $f$ for a given instance $x \in X$ is

$$f(x) = [f_1(x), f_2(x), \cdots, f_k(x)]^T, \tag{1.1}$$

where $f_j(x)$ ($j = 1, \cdots, k$) is either 1 or 0, indicating the association of $x$ with the $j$th label. In the following, the set of labels is denoted as $L = \{C_1, \cdots, C_k\}$.
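The notation above can be illustrated with a minimal sketch. It assumes a hypothetical label set with k = 4 and hypothetical component functions that simply threshold one feature each. The names `LABELS`, `encode_labels`, `make_classifier`, and `threshold_on` are introduced here for illustration only and do not come from the text.

```python
from typing import Callable, List, Sequence, Set

# Hypothetical label set L = {C1, ..., Ck} with k = 4 (an assumption for this sketch).
LABELS = ["C1", "C2", "C3", "C4"]


def encode_labels(relevant: Set[str], labels: Sequence[str] = LABELS) -> List[int]:
    """Return the label vector y in {0, 1}^k: y[j] is 1 iff labels[j] is relevant."""
    return [1 if c in relevant else 0 for c in labels]


def make_classifier(
    components: Sequence[Callable[[Sequence[float]], int]],
) -> Callable[[Sequence[float]], List[int]]:
    """Stack k binary functions f_1, ..., f_k into f(x) = [f_1(x), ..., f_k(x)]^T."""
    def f(x: Sequence[float]) -> List[int]:
        # Each component contributes exactly one 0/1 entry of the output vector.
        return [int(bool(f_j(x))) for f_j in components]
    return f


def threshold_on(j: int) -> Callable[[Sequence[float]], int]:
    """Hypothetical component f_j: predict 1 when the j-th feature exceeds 0.5."""
    return lambda x: 1 if x[j] > 0.5 else 0


f = make_classifier([threshold_on(j) for j in range(len(LABELS))])

y = encode_labels({"C1", "C3"})      # label vector of a training instance
y_hat = f([0.9, 0.2, 0.7, 0.1])      # predicted label vector for an unseen x
```

Here `y` plays the role of an element of the label space and `y_hat` the role of f(x) in Eq. (1.1). In practice each component would be learned from the training data rather than fixed by hand.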