ABSTRACT
This chapter introduces topic modelling with Latent Dirichlet Allocation (LDA). LDA is a generative probabilistic model that relies on the bag-of-words approach, treating each document as an unordered collection of tokens. It assumes that the tokens observed in each document result from an underlying generative process driven by latent, abstract topics: each document is a mixture of topics, and each topic is a distribution over tokens. Against this backdrop, the chapter aims to equip you with the knowledge to understand the LDA algorithm, preprocess the text corpus, automate topic extraction, select topics for interpretation, calculate and visualize goodness-of-fit measures, and interpret topics effectively.
