Get Your Hands Dirty: Emerging Data Practices as Challenge for Research Integrity

doi:10.5117/9789462987173-13

ABSTRACT

In November 2014 two interns (the first two authors of this chapter listed above) at the Utrecht Data School started investigating an online discussion forum for patients under the supervision of Mirko Tobias Schäfer (this essay’s third author). Without his knowledge and without any prior knowledge of scraping websites, the two students downloaded 150,000 patient profiles (which included, amongst other information, age, location, diagnoses and treatments related to these patients), using a 90-euro off-the-shelf scraper tool,¹ without informing these patients or requesting consent from them or the platform providers. The plan was to first explore the data (taking the necessary precautions to keep it confidential) and later, after formulating a research question and hypothesis, to ask permission to conduct in-depth analysis of the data relevant to our research. That plan was never realized. After a few days of acting like ‘information flâneurs’ (Dörk et al. 2011), browsing through the data without specific questions or goals in mind, we were notified that our department’s supervisors had terminated the project due to concerns about research ethics.² Their decision prompted us to rethink our actions and to question our research practices as well as existing research standards. Assuming that the rather novel data sources and practices of analysis were disrupting the traditional research process and contradicting established guidelines in research ethics, we took these events as the inspiration to revisit research ethics concerning big data research.