ABSTRACT

Neural networks (NN) were originally developed as an attempt to emulate the human brain. The brain has about 1.5×10^10 neurons, each with 10 to 10^4 connections called synapses. Messages travel between neurons at about 100 m/sec, which is much slower than CPU speed. Given that our fastest reaction time is around 100 ms and a neuron's computation time is 1-10 ms, a reaction can involve no more than about 100 sequential steps. This is inconceivably small for a sequential computation, even in machine code; therefore, the brain must be computing in parallel.

The original idea behind neural networks was to use a computer-based model of the human brain to perform complex tasks. We can recognize people in fractions of a second, but this task is difficult for computers. So why not make software more like the human brain? Despite the promise, there are some drawbacks. The brain model of connected neurons, first suggested by McCulloch and Pitts (1943), is too simplistic given more recent research. For these and other reasons, the methodology is more properly called artificial neural nets. As with artificial intelligence, the promise of NNs is not matched by the reality of their performance. There was a great deal of hype concerning NNs, but that is now past. Nevertheless, viewed simply as an algorithmic procedure, NNs are competitive with other less ambitiously named methods.

NNs are used for various purposes. They can serve as biological models, which was the original motivation. They can also serve as hardware implementations for adaptive control. But the area of application we are interested in is data analysis. There are NN models that rival the regression, classification and clustering methods normally used by statisticians.

A perceptron is a model of a neuron and is the basic building block of a neural network, as depicted in Figure 17.1. The output x_o is determined from the inputs x_i:

[Figure 17.1: A perceptron, with several inputs feeding a single output.]

x_o = f_o( sum_i w_i x_i )

where f_o is called the activation function. Standard choices include the identity, logistic and indicator functions. The w_i are weights. The NN learns the weights from the data. A statistician would prefer to say that the NN estimates the parameters from the data. Thus NN terminology differs from statistical usage in ways that can be confusing.
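As a minimal sketch, the perceptron computation above can be written directly in Python. The three activation functions are the standard choices named in the text; the particular input and weight values below are illustrative assumptions, not taken from the text.

```python
import math

def perceptron(inputs, weights, activation):
    """Compute the perceptron output x_o = f_o(sum_i w_i * x_i)."""
    s = sum(w * x for w, x in zip(weights, inputs))
    return activation(s)

# Standard activation function choices mentioned in the text:
def identity(s):
    return s

def logistic(s):
    return 1.0 / (1.0 + math.exp(-s))

def indicator(s):
    return 1.0 if s > 0 else 0.0

# Illustrative inputs and weights (assumed values, for demonstration only):
x = [1.0, 0.5, -0.2]
w = [0.4, -0.3, 0.8]
out = perceptron(x, w, logistic)
```

With the logistic activation the output lies in (0, 1), which is why that choice is natural when the perceptron output is interpreted as a class probability.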