ABSTRACT

The first textbook of its kind, Quantitative Corpus Linguistics with R demonstrates how to use the open-source programming language R for corpus linguistic analyses. Computational and corpus linguists doing corpus work will find that R covers an enormous range of tasks that currently require several separate programs: searching and processing corpora, arranging and outputting the results of corpus searches, statistical evaluation, and graphing.

Acknowledgments

1. Introduction

1.1 Why Another Introduction to Corpus Linguistics?

1.2 Outline of the Book

1.3 Recommendation for Instructors

2. Three Central Corpus-linguistic Methods

2.1 Corpora

2.2 Frequency Lists

2.3 Lexical Co-occurrence: Collocations

2.4 (Lexico-)Grammatical Co-occurrence: Concordances

3. An Introduction to R

3.1 A Few Central Notions: Data Structures, Functions, and Arguments

3.2 Vectors

3.3 Factors

3.4 Data Frames

3.5 Lists

3.6 Elementary Programming Functions

3.7 Character/String Processing

3.8 File and Directory Operations

4. Using R in Corpus Linguistics

4.1 Frequency Lists

4.2 Concordances

4.3 Collocations

4.4 Excursus 1: Processing Multi-tiered Corpora

4.5 Excursus 2: Unicode

5. Some Statistics for Corpus Linguistics

5.1 Introduction to Statistical Thinking

5.2 Categorical Dependent Variables

5.3 Interval/Ratio Dependent Variables

5.4 Customizing Statistical Plots

5.5 Reporting Results

6. Case Studies and Pointers to Other Applications

6.1 Introduction to the Case Studies

6.2 Some Pointers to Further Applications

Appendix

References

Endnotes

Index