ABSTRACT

Big Data, big mistakes? In 2011 The Guardian revealed that the US government had manipulated social media by creating fake accounts with the aim of spreading automatic pro-American propaganda. It is not uncommon to read about VIPs or politicians having fake followers, like Justin Bieber’s 37 million followers. With a small margin of error, it is estimated that 83 million Facebook accounts (8.7% of the total) are fake, half of them simply duplicates of existing accounts. In March 2014, the Financial Times published an article entitled “Big data: are we making a big mistake?”, focusing on the big failure of the Google Flu Trends experiment – an example already discussed in Chapter 1 (Butler 2013; Cook et al. 2011) – and on many other similar cases. These articles report the great disillusionment with the naïve idea that the volume of data can by itself replace scientific reasoning (Anderson 2008), much like the enthusiasm (Noble 2003) raised by the early completion of the Human Genome Project in 2003, that is, the mapping of all genes in human DNA. Despite providing the exact mapping of every gene, those data as such carry no information per se: once the human genome was transcribed, biologists looked at it with no clue about its real meaning. The same applies to the sea of social network discussions: we can download them all, but they appear to be just noise. Having the data, and being able to store, manipulate and move them, does not coincide with the ability to extract information from them.