Discriminative Topic Models

doi:10.1201/b11580-4

ABSTRACT

This chapter discusses discriminative latent Dirichlet allocation (DLDA), a discriminative topic model that combines multiclass logistic regression with latent Dirichlet allocation (LDA). Several models for analyzing topics in document collections have been proposed, and LDA is among the most widely used. Supervised latent Dirichlet allocation (SLDA) is an extension of LDA that accommodates response variables in addition to the documents themselves. However, the response variable in SLDA is a real number, assumed to be generated from a normal linear model, so it differs from the categorical labels used in classification. DLDA instead models categorical class labels. Because it allows the number of components k in the mixed membership to differ from the number of classes c, the model with a larger k often discovers additional latent structure beyond what is implied by the class labels.
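The contrast between a real-valued response and a categorical one can be illustrated with a minimal sketch. The abstract does not give the exact form of either model, so the code below assumes that both responses are driven by a document's topic proportions: a linear mean for the real-valued case, and a softmax (multiclass logistic regression) for the categorical case. The function names, weights, and topic proportions are hypothetical values chosen for illustration; they are not taken from the chapter.

```python
import math


def class_probabilities(topic_props, weights):
    """Multiclass logistic regression (softmax) over topic proportions.

    topic_props: k topic proportions for one document (assumed input).
    weights: c rows of k coefficients, one row per class (hypothetical).
    """
    scores = [sum(w * z for w, z in zip(row, topic_props)) for row in weights]
    top = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear_response_mean(topic_props, coeffs):
    """Mean of a real-valued response under a normal linear model."""
    return sum(b * z for b, z in zip(coeffs, topic_props))


# Hypothetical document with k = 4 topics and c = 2 classes, so k != c.
topic_props = [0.1, 0.6, 0.2, 0.1]
weights = [
    [1.0, -0.5, 0.3, 0.0],   # class 0
    [-1.0, 0.8, 0.0, 0.2],   # class 1
]

probs = class_probabilities(topic_props, weights)
mean = linear_response_mean(topic_props, weights[0])
```

In this sketch the weight matrix has c rows and k columns, which is why k and c need not match: the classifier maps a k-dimensional mixed membership to c class probabilities.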