ABSTRACT

In the last chapter we saw a simple model of a neuron that simulated what seems to be the most important function of a neuron (deciding whether or not to fire) and ignored the nasty biological things like chemical concentrations, refractory periods, etc. Having this model is only useful if we can use it to understand what is happening when we learn, or use it to solve some kind of problem. We are going to try to do both in this chapter, although the learning that we try to understand will be machine learning rather than animal learning.

One thing that is probably fairly obvious is that one neuron isn't that interesting. It doesn't do very much, except fire or not fire when we give it inputs. In fact, it doesn't even learn: if we feed in the same set of inputs over and over again, the output of the neuron never varies; it either fires or it does not. So to make the neuron a little more interesting we need to work out how to make it learn, and then we need to put sets of neurons together into neural networks so that they can do something useful. The question we need to think about first is how our neurons can learn.
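To see this concretely, here is a minimal sketch of a McCulloch and Pitts neuron in NumPy (the function name mcp_neuron and the particular numbers are our own choices for illustration, not from the original model's presentation):

```python
import numpy as np

def mcp_neuron(inputs, weights, threshold):
    # Weighted sum of the inputs compared against a fixed threshold:
    # fire (return 1) if the sum reaches the threshold, otherwise return 0.
    return 1 if np.dot(inputs, weights) >= threshold else 0

weights, threshold = np.array([0.5, 0.5]), 0.8

# The same inputs always produce the same output; nothing in the
# neuron changes, so it never learns:
for _ in range(3):
    print(mcp_neuron(np.array([1, 1]), weights, threshold))  # 1, every time
    print(mcp_neuron(np.array([1, 0]), weights, threshold))  # 0, every time
```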

We are going to look at supervised learning for the next few chapters, which means that the algorithms will learn by example: the dataset that we learn from has the correct output values associated with each datapoint. At first sight this might seem pointless, since if you already know the correct answer, why bother learning at all? The key is in the concept of generalisation that we saw in Section 1.2. Assuming that there is some pattern in the data, by showing the neural network a few examples we hope that it will find the pattern and predict the other examples correctly. This is sometimes known as pattern recognition. Before we worry too much about this, let's think about what learning is.
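To make "learning by example" concrete before we go on: in code, a supervised dataset is simply a set of input vectors paired with their correct outputs, and generalisation means predicting well on datapoints that were held back during learning. A minimal sketch, with a made-up toy dataset (the logical OR function) standing in for real data:

```python
import numpy as np

# A supervised dataset: one input vector per row, and the correct
# output (the target) for each row.
inputs  = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
targets = np.array([0, 1, 1, 1])   # here, the logical OR of each input pair

# Generalisation: learn from some of the datapoints, then see whether
# the learner predicts the held-back ones correctly.
train_inputs, train_targets = inputs[:3], targets[:3]
test_inputs,  test_targets  = inputs[3:], targets[3:]
```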

In the previous chapter it was suggested that you learn if you get better at doing something. So if you can't program in the first semester and you can in the second, you have learnt to program. Something has changed (adapted), presumably in your brain, so that you can do a task that you were not able to do previously. Have a look again at the McCulloch and Pitts neuron (e.g., in Figure 1.6) and try to work out what can change in that model. The only things that make up the neuron are the inputs, the weights, and the threshold (and there is only one threshold for each neuron, but lots of inputs). The inputs can't change, since they are external, so we can only change the weights and the threshold. This is interesting, since it tells us that most of the learning is in the weights, which aren't part of the neuron at all; they are the model of the synapse! Getting excited about neurons turns out to be missing something important: the learning happens between the neurons, in the way that they are connected together. So in order to make a neuron learn, the question that we need to ask is: How should we change the weights and thresholds of the neurons so that the network gets the right answer more often?
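As a preview of the answer, here is a sketch of an error-driven update in the spirit of the Perceptron rule that is developed properly below; the learning rate eta, the sign conventions, and the function name update are our own choices:

```python
import numpy as np

def update(weights, threshold, inputs, target, eta=0.25):
    # Fire or not with the current weights and threshold...
    output = 1 if np.dot(inputs, weights) >= threshold else 0
    # ...then nudge both towards the right answer. error is +1 if the
    # neuron should have fired but didn't, and -1 if it fired when it
    # shouldn't have (and 0 if the output was already correct).
    error = target - output
    weights = weights + eta * error * inputs
    threshold = threshold - eta * error
    return weights, threshold
```

Notice that the threshold moves the opposite way to the weights: if the neuron failed to fire when it should have, the weights grow along the input and the threshold drops, both of which make firing easier next time.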

Now that we know the right question to ask, we'll have a look at our very first neural network, the space-age-sounding Perceptron, and see how we can use it to solve the problem (it really was space-age, too: created in 1958). Once we've worked out the algorithm and how it works, we'll look at what it can and cannot do, and then see how statistics can give us insights into learning as well.