Text Mining and Topic Modeling

doi:10.4324/9781003025245-24

ABSTRACT

In the digital age, texts are available in unprecedented quantities. Since the beginning of textual mass production with Gutenberg's printing press, arguably a decisive step in the development of a differentiated modern society (Luhmann, 1995), and continuing with the rise of literacy and, more recently, the invention of the personal computer and the World Wide Web, we have seen an explosion of textual information. Nowadays, everybody is constantly exposed to text, and most people produce text themselves. Textual traces are therefore ubiquitous: they can be found in blogs and articles on the web, social media posts and comments, messages of all sorts, e-books, and digital representations of vast physical libraries. As a persistent and pervasive feature of social life in contemporary societies, textual data constitutes an important part of computational social science (Lazer et al., 2009; Heiberger & Riebling, 2016).