ABSTRACT

Focusing on the first two decades of the 21st century, this chapter describes what is referred to as the ‘data revolution’. It presents a menu of the wide variety of ‘big data’, new data, and new data sources that have surfaced. The discussion covers transactional data and data from log servers, search engines, social media, satellites, and sensors. It also discusses crowdsourced and open data as new collection and dissemination methods, respectively. The discussion identifies issues associated both with the attributes of the new data and with the fact that many of them emanate from private sources. The exposition then takes head-on the practical implications of the new research design. It challenges popular misconceptions about the superiority of producing old outputs with new data and demonstrates, with detailed examples, the exact relationship between the old and the new research paradigms, capturing the trade-offs involved in the production of end outputs. It shows that arguments for ‘linear’ substitutions of data are overly simplistic. It reconnects data to their intended uses and offers tips on where to intercept data flows. In the process, it explains clearly why some data exist and others don’t, proposes to break the taboos around different kinds of data, and sheds new light on data linkages. The fact-checking section discusses the Google flu debacle and the faulty Facebook data delivered for research.