ABSTRACT

This chapter considers the connection between grammaticality, acceptability, and probability. The use of neural language models in natural language processing tasks that require syntactic knowledge raises the question of how grammaticality relates to probability. Grammaticality is a theoretical property that is not directly accessible to observation. Speakers' acceptability judgements, by contrast, can be observed and measured, and they provide the primary data for most linguistic theories. An adequate theory of syntactic knowledge must therefore be able to account for the observed data of acceptability judgements. The chapter focuses on a series of experiments in which the acceptability predictions of unsupervised language models are evaluated by Pearson correlation with mean human ratings obtained through crowd-sourced annotation on Amazon Mechanical Turk. Infelicities are introduced into some of the test sets through round-trip machine translation. Gibbs sampling is a Markov Chain Monte Carlo procedure for generating a sequence of approximate samples from a multivariate probability distribution.
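
As a rough illustration of the evaluation described above, the sketch below correlates a language-model-derived acceptability score with mean human ratings using Pearson's r. The SLOR-style normalisation, the function names, and the toy figures are illustrative assumptions, not the chapter's actual procedure or data.

    # Minimal sketch (not the chapter's code): correlate an acceptability score
    # derived from an unsupervised language model's log probability with mean
    # human ratings. SLOR normalisation, names, and numbers are assumptions.
    from scipy.stats import pearsonr

    def slor(logprob_sentence, logprob_unigrams, length):
        """Syntactic log-odds ratio: length-normalised log probability with
        unigram (lexical frequency) effects factored out."""
        return (logprob_sentence - logprob_unigrams) / length

    # Hypothetical per-sentence quantities from a trained language model.
    model_scores = [
        slor(lp_s, lp_u, n)
        for lp_s, lp_u, n in [(-42.1, -55.3, 9), (-61.7, -60.2, 11), (-30.5, -38.8, 7)]
    ]

    # Hypothetical mean acceptability ratings from crowd-sourced annotation.
    human_means = [3.8, 1.9, 3.2]

    r, p_value = pearsonr(model_scores, human_means)
    print(f"Pearson r = {r:.3f} (p = {p_value:.3f})")
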
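The following minimal sketch illustrates Gibbs sampling in the sense characterised above: each variable is resampled in turn from its conditional distribution given the others, and the resulting chain yields approximate samples from the joint distribution. The bivariate Gaussian target and parameter values are illustrative assumptions, not a model from the chapter.

    # Illustrative Gibbs sampler for a standard bivariate Gaussian with
    # correlation rho (an assumed toy target, not one of the chapter's models).
    import random

    def gibbs_bivariate_gaussian(rho, n_samples, burn_in=500):
        """Draw approximate samples from a standard bivariate Gaussian with
        correlation rho by alternating the two conditional distributions."""
        x, y = 0.0, 0.0
        cond_sd = (1.0 - rho ** 2) ** 0.5
        samples = []
        for i in range(burn_in + n_samples):
            x = random.gauss(rho * y, cond_sd)  # x | y ~ N(rho*y, 1 - rho^2)
            y = random.gauss(rho * x, cond_sd)  # y | x ~ N(rho*x, 1 - rho^2)
            if i >= burn_in:
                samples.append((x, y))
        return samples

    draws = gibbs_bivariate_gaussian(rho=0.8, n_samples=2000)
    mean_x = sum(x for x, _ in draws) / len(draws)
    print(f"empirical mean of x ~= {mean_x:.2f} (target 0)")
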