ABSTRACT

The neural network is essentially an information-processing system in which randomness coexists due to the inevitable presence of noise; it therefore represents an entropy-dictated system. For the purpose of comparison and discussion, the various error-measures that can be backpropagated toward the optimization of gradient-descent learning in neural networks are categorized. A specific class of information-theoretic error-measures is deduced via the so-called Csiszár family of cross-entropy functionals. These measures stand in contrast to the most conventional specification of the neural-network error-metric, namely the mean-square value of the difference between the network output and the teacher (target) value, which is minimized so as to achieve optimal network training. The relative “informativity” specified by Csiszár's formulations refers, in essence, to comparing the negative entropies associated with the network output and the target value(s).
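As a concrete illustration of the two kinds of error measures contrasted above, the following sketch (in Python/NumPy; not taken from the paper, and with function names and the particular convex generator f(u) = u log u chosen here purely as illustrative assumptions) computes the conventional squared-error metric and a Csiszár-type divergence, together with the gradients that a gradient-descent learning rule would backpropagate:

    import numpy as np

    def squared_error(output, target):
        """Conventional metric: mean-square difference, with its gradient."""
        diff = output - target
        loss = 0.5 * np.sum(diff ** 2)
        grad = diff                      # d(loss)/d(output_i)
        return loss, grad

    def csiszar_divergence(output, target,
                           f=lambda u: u * np.log(u),
                           df=lambda u: np.log(u) + 1.0):
        """Csiszar-type divergence D_f = sum_i output_i * f(target_i / output_i),
        for a convex f with f(1) = 0. The default f(u) = u log u recovers the
        Kullback-Leibler cross-entropy, one member of the family."""
        u = target / output
        loss = np.sum(output * f(u))
        grad = f(u) - u * df(u)          # d(loss)/d(output_i), by the chain rule
        return loss, grad

    # One illustrative gradient-descent step on a normalized network output:
    target = np.array([0.7, 0.2, 0.1])
    output = np.array([0.5, 0.3, 0.2])
    loss, grad = csiszar_divergence(output, target)
    output -= 0.1 * grad

The squared-error gradient depends only on the signed difference between output and target, whereas the Csiszár-type gradient depends on their ratio, which is one way of reading the “relative informativity” comparison of negative entropies described above.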