ABSTRACT

This chapter analyses the relationship between digital humanities, information science and lexicography, exploring the new possibilities and challenges, and revisiting the lexicographic theories and practices, that result from the need to digitise dictionaries. It describes the methodology applied to the Vocabulário Ortográfico da Língua Portuguesa (Orthographic Vocabulary of the Portuguese Language; VOLP-1940), which aims to promote access to cultural heritage content while fostering its reusability. VOLP-1940, published in 1940, is the first orthographic vocabulary issued by the Academia das Ciências de Lisboa (ACL). The goal is to analyse the vocabularies with computational methods, both to better assess the importance of this work for the evolution of science and mentalities in the 20th century and to contribute to the current movement towards innovative, data-driven computational methods for text digitisation, encoding and analysis. The digitisation of VOLP-1940 aims to create a lexicographical resource encoded in TEI (Text Encoding Initiative), with structured information in SKOS (Simple Knowledge Organisation System), to guarantee its future connection to other systems and resources. This transdisciplinary approach combines theories and methods from linguistics, lexicography and information science, placing the TEI and SKOS standards at the core of the research. The research also aims to fill a gap in Portuguese lexicography, given that legacy dictionaries are still rare online.