ABSTRACT

As you will have inferred from this book so far, I am very much in favor of the idea that corpus linguists should not be dependent on a limited set of tools such as ready-made corpus software, which, by definition, comes with limitations: Not everything a (corpus) linguist may want to do can be readily implemented in it. But my reluctance to be dependent on ‘things’ goes further than that, which is why this book has only used freely available software and in fact mostly open-source software: All the code was primarily developed on Linux (though tested on Windows as well), the spreadsheet software used was from the LibreOffice suite, etc. A final implication of this worth mentioning briefly here is that, with very few exceptions, this book has taught you how to do things relying as much as possible on just base R, i.e., relying on extra packages as little as possible. The exceptions are the packages for processing XML data and dplyr for the %>% operator, but even there I often show how things can be done without them. This is, again, just so that researchers don’t become overly reliant on a particular package’s functionality, which may change, be discontinued, etc. I prefer researchers to be in the driving seat, as they are when they are able to develop the functionality they need with the functions in base R, because then they, like all of us, know best what they’re doing.