ABSTRACT

As a cultural, technological, and scholarly phenomenon based on the interplay of algorithmic modelling and pattern analysis, the big data revolution has been a major catalyst in changing the way the relationships within and among pieces of information are seen and understood. In general terms, theorists and critics refer to big data with the umbrella definition of “things one can do at a large scale that cannot be done at a smaller one” (Schonberger and Cukier 2014), with the aim of uncovering hidden patterns capable of revealing new forms of knowledge and value. The versatile potential of big data analytics soon extended its impact beyond the predictive modelling of market trends and customer preferences for which it was developed, and reached the field of literary scholarship. Nowadays, the very existence of billions of bytes of digital information in the form of electronic text archives, along with web access to this data, is forcing scholars to rethink their approach to literary objects, thus allowing the emerging figure of the digital humanist to address questions that were previously inconceivable, though only up to a point. In fact, the very idea of a large-scale investigation of the literary system first appeared as early as 2000, thanks to Franco Moretti’s (2000) “conjectures” on the notion of distant reading, which he later developed fully in his Maps, Graphs, Trees: Abstract Models for a Literary History. Moretti’s distant reading constitutes an alternative to the common practice of close textual analysis of a few canonical masterpieces, in favor of an unusual perspective that takes into account the often ignored 99% of works ever published, those that fall outside the canon.
In his opinion, scholars must widen their horizons to encompass this whole body of neglected literature (“the great unread”) so as to surpass the limitations of the traditional methods of reading more or reading closer and embrace the vision of a new quantitative formalism. More than ten years later, in 2013, Matthew Jockers finally gave procedural concreteness to Moretti’s standpoint through the experimental methodology of computer-based macroanalysis. In his pioneering book Macroanalysis: Digital Methods and Literary History, Jockers explains how computer-based macroanalysis expands the object of study of literary criticism as never before: by applying statistical tools and algorithmic protocols derived from computational linguistics, it enables the quantitative investigation of thousands of books at once.