ABSTRACT

This paper examines how and why empirical results on first-word acquisition in infants can arise in a generic associative PDP model. During learning, the network is exposed to a micro-world composed of categories, each made of a cluster of “images” and a label attached to that cluster. The network architecture encodes labels and images in a common level of representation, from which labels can subsequently be extracted from images and images from labels. If (1) the learning rule is an error-correction (steepest-descent) algorithm, (2) the image clusters are sufficiently “fuzzy”, (3) the image-to-label mapping is consistent, and (4) the network capacity is adapted to the size of the micro-world, this simple generic model can be shown to account for a broad spectrum of first-word acquisition data, including the acquisition “burst”, underextensions, overextensions, gradual generalization, comprehension before production, and decontextualization.
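As an illustration only, the kind of setup described above can be sketched in a few lines of NumPy. This is not the model studied in the paper: the linear auto-associator, the block-shaped prototypes, the Gaussian noise, and every size, constant, and name below (`K`, `BLOCK`, `NOISE`, `HIDDEN`, `reconstruct`, and so on) are assumptions chosen to keep the example small. Joint image-plus-label patterns are passed through a narrow shared hidden layer and the weights are adjusted by gradient descent on the reconstruction error; a label is then read out from an image alone ("production") and an image from a label alone ("comprehension").

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical micro-world: K categories, each a fuzzy cluster of "images".
K, BLOCK, N_PER = 3, 4, 30           # categories, pixels per block, exemplars
D = K * BLOCK                        # image dimensionality
NOISE, HIDDEN, LR, EPOCHS = 0.3, 3, 0.1, 2000   # assumed settings

# Each prototype lights up its own block of pixels (an assumption).
prototypes = np.repeat(np.eye(K), BLOCK, axis=1)        # shape (K, D)
cats = np.repeat(np.arange(K), N_PER)                   # category of each exemplar
images = prototypes[cats] + NOISE * rng.standard_normal((K * N_PER, D))
labels = np.eye(K)[cats]                                # consistent one-hot labels
X = np.hstack([images, labels])                         # joint image+label patterns

# Linear auto-associator with a shared hidden layer of limited capacity.
W1 = 0.1 * rng.standard_normal((D + K, HIDDEN))
W2 = 0.1 * rng.standard_normal((HIDDEN, D + K))


def reconstruct(inputs):
    """Encode inputs into the common hidden layer, then decode them."""
    return inputs @ W1 @ W2


def mean_error(inputs):
    """Half the mean squared reconstruction error per pattern."""
    return 0.5 * np.mean(np.sum((reconstruct(inputs) - inputs) ** 2, axis=1))


initial_error = mean_error(X)
for _ in range(EPOCHS):
    # Batch gradient descent (error correction) on the reconstruction error.
    hidden = X @ W1
    err = hidden @ W2 - X
    grad_W2 = hidden.T @ err / len(X)
    grad_W1 = X.T @ (err @ W2.T) / len(X)
    W1 -= LR * grad_W1
    W2 -= LR * grad_W2
final_error = mean_error(X)

# "Production": present an image alone and read out the label units.
image_only = np.hstack([images, np.zeros_like(labels)])
produced = reconstruct(image_only)[:, D:].argmax(axis=1)
production_accuracy = np.mean(produced == cats)

# "Comprehension": present a label alone and compare the evoked image
# with each prototype.
label_only = np.hstack([np.zeros((K, D)), np.eye(K)])
evoked = reconstruct(label_only)[:, :D]
comprehended = (evoked @ prototypes.T).argmax(axis=1)
comprehension_accuracy = np.mean(comprehended == np.arange(K))

print("reconstruction error:", initial_error, "->", final_error)
print("production accuracy:", production_accuracy)
print("comprehension accuracy:", comprehension_accuracy)
```

The hidden layer is deliberately given as many units as there are categories, a stand-in for condition (4); with a much wider layer the network could simply copy its input, and a missing label would no longer be filled in from the image.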