ABSTRACT

This chapter reports the results of a system-centric evaluation of Grammarly as used in research paper revision. It first presents three overall measures: precision of flagging (i.e., the proportion of all Grammarly-flagged usages that were indeed erroneous), precision of suggestion (i.e., the proportion of all Grammarly-provided correction suggestions that were appropriate), and recall (i.e., the proportion of all errors that were identified by Grammarly). It then presents the precision and recall values for each of Grammarly's four correctness-related error types: conventions, grammar, punctuation, and spelling. Finally, within each error type, the chapter categorizes the inaccurate flags, inaccurate suggestions, and missed errors; reports the frequency of each subcategory; and discusses possible reasons for Grammarly's mistakes.
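The three measures defined above can be illustrated with a minimal Python sketch. The class name, field names, and counts below are hypothetical and are not taken from the chapter's data. The sketch also assumes that an error counts as "identified" whenever Grammarly flags a usage that is truly erroneous, whether or not the accompanying suggestion is appropriate.

```python
from dataclasses import dataclass


@dataclass
class AnnotationCounts:
    """Hypothetical tallies from a manually annotated set of texts."""
    flagged: int                  # usages flagged by the tool
    flagged_erroneous: int        # flagged usages that were truly erroneous
    suggestions: int              # correction suggestions the tool provided
    suggestions_appropriate: int  # suggestions judged appropriate
    total_errors: int             # all true errors, flagged or missed


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN when the measure is undefined (empty denominator).
    return numerator / denominator if denominator else float("nan")


def precision_of_flagging(c: AnnotationCounts) -> float:
    # Proportion of flagged usages that were indeed erroneous.
    return _ratio(c.flagged_erroneous, c.flagged)


def precision_of_suggestion(c: AnnotationCounts) -> float:
    # Proportion of provided suggestions that were appropriate.
    return _ratio(c.suggestions_appropriate, c.suggestions)


def recall(c: AnnotationCounts) -> float:
    # Proportion of all true errors that the tool identified
    # (assumption: "identified" means correctly flagged).
    return _ratio(c.flagged_erroneous, c.total_errors)


# Invented counts, for illustration only.
example = AnnotationCounts(
    flagged=50,
    flagged_erroneous=40,
    suggestions=45,
    suggestions_appropriate=36,
    total_errors=80,
)

print(f"precision of flagging:   {precision_of_flagging(example):.2f}")
print(f"precision of suggestion: {precision_of_suggestion(example):.2f}")
print(f"recall:                  {recall(example):.2f}")
```

The same functions could be applied separately to the counts for each error type (conventions, grammar, punctuation, and spelling) to obtain per-type values.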