ABSTRACT

The simplest learning rule is named after the psychologist Donald Hebb, who formulated an intuitive, physiologically based version of it in a work published in 1949. Hebb proposed that knowledge is stored in “cell assemblies”: connected groups of neurons that activate one another. To explain how such assemblies could form on the basis of experience, Hebb proposed that if two connected cells repeatedly fire at the same time, then the strength of the connection between them (and hence the ability of one cell to activate the other) increases. Thus, things that go together in one’s experience (e.g. the face and voice of a friend) become associated, such that experiencing one can activate, or bring to mind, the other. In connectionist terms, the basic rule has the form:

$$\Delta w_{ij} = a_i \times a_j \tag{7}$$

where $a_i$ and $a_j$ are the activations of two units connected by the weight $w_{ij}$. Hence, the weight increases (i.e. has a positive change) whenever the two units are active at the same time; if either unit has zero activation, then $\Delta w_{ij} = 0$ and the weight does not change. In neural terms, the rule states that the strength of (excitatory) synapses should increase whenever the two neurons connected by such synapses fire (become depolarized) at the same time, or at least very closely in time. The phenomenon of long-term potentiation (LTP) of synapses has this characteristic (see e.g. Carlson, 2001, Chapter 14).

The rule can be used to perform associative learning between mental entities that are active at the same time. For instance, Fig. 1.8 shows nodes from two layers of a simple reading model (e.g. McClelland & Rumelhart, 1981). In one layer each node stands for a letter (in a given position); in the upper layer each node stands for a word. If the letter nodes for “C”, “A” and “T” are activated at the same time as the word node “Cat”, then the weights on the connections between these units (shown as solid, bold lines) will be strengthened. Thus, the next time the same pattern of letter nodes is activated without independent activation of “Cat”, the activation spreading from the letters is more likely to activate the word (i.e. the letter pattern will be recognized as forming a known word).
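The letter-to-word example can be sketched in a few lines of Python. This is only an illustration of Eq. (7) under simplifying assumptions, not the reading model itself: activations are binary, letter position is ignored, the unit lists and the names `hebbian_update` and `net_input` are hypothetical, and the `learning_rate` parameter is an addition that leaves the rule exactly as written when set to 1.

```python
# Hypothetical unit labels; the real model codes letters by position.
LETTERS = ["C", "A", "T", "D", "O", "G"]
WORDS = ["Cat", "Dog"]


def hebbian_update(weights, letter_act, word_act, learning_rate=1.0):
    """Add learning_rate * a_i * a_j to every letter-to-word weight."""
    for i, a_i in enumerate(letter_act):
        for j, a_j in enumerate(word_act):
            weights[i][j] += learning_rate * a_i * a_j
    return weights


def net_input(weights, letter_act):
    """Summed, weighted activation arriving at each word unit."""
    return [
        sum(letter_act[i] * weights[i][j] for i in range(len(letter_act)))
        for j in range(len(weights[0]))
    ]


# All weights start at zero: no associations have been learned yet.
weights = [[0.0 for _ in WORDS] for _ in LETTERS]

# One learning episode: "C", "A", "T" are active together with "Cat".
cat_letters = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
cat_word = [1.0, 0.0]
hebbian_update(weights, cat_letters, cat_word)

# Later, present the letters alone and see what reaches each word unit.
word_input = net_input(weights, cat_letters)
print(dict(zip(WORDS, word_input)))  # {'Cat': 3.0, 'Dog': 0.0}
```

After a single pairing, only the three weights from “C”, “A” and “T” to “Cat” have changed, so presenting the letters alone sends input to “Cat” and none to “Dog”. Any weight with an inactive unit at either end stays at zero, as the rule requires.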