ABSTRACT

This chapter presents several large-scale investigations covering millions of online news articles. It aims to demonstrate how automated approaches can access and analyze both the semantic and the stylistic properties of content, thereby opening up the possibility of extending media content analysis to huge data sets. The chapter brings greater sophistication to machine-based forms of analysis and gives access to very large samples, allowing the kinds of comprehensive, longitudinal analysis that are often beyond the reach of conventional content analysis. It explores the application of state-of-the-art Artificial Intelligence (AI) techniques, including data mining, statistical machine learning, natural language processing, and computer vision, to the large-scale automated analysis of news media content. To demonstrate these approaches, the chapter presents three areas of analysis (writing style, gender representation, and narrative analysis) with some fairly predictable outcomes.