ABSTRACT

This chapter examines the considerations involved in using social media as digital research data. It begins with the impact that digital environments and technologies have on cognition and society, in particular on the ways in which language is enacted through the digital, and then turns to the role that technical aspects play when analysing language in social media, stressing the importance of access to information and to the code. Legal and ethical issues involved in the use of social media data are then considered in relation to information and code, providing an overview, current as of November 2022, of the situation surrounding social media data, framed from a corpus linguistics-informed perspective. The focus then shifts to how corpus linguistics has historically approached digital textual data, starting from the definition of a corpus and moving on to the layers of contextual and linguistic information with which a corpus is usually enriched, and which support and enable the analysis of language. These layers, presented as the technical rendition of the digital technicalities of social media, set the basis for how social media data can be ‘translated’ into a corpus that correctly represents the object of inquiry.