ABSTRACT

This chapter reviews the general statistical mechanics formulation of learning from examples in large layered networks. It focuses on new perceptron models: learning binary weights from noisy patterns, and learning with binary weights when the input architecture is undetermined. The emergence of very large parametric models in machine learning suggests a natural analogy between the model parameter space and the configuration space of complex physical systems. The currently dominant approach in computational learning theory is based on Valiant's learning model and on the notion of Probably Approximately Correct (PAC) learning. A simple and instructive limit of the learning theory is that of high temperatures. Understanding learning from noisy patterns allows us to study an even more interesting learning model: learning when the architecture of the student does not match that of the teacher. A notable aspect of this model is that it permits studying the generalization problem when the number of input nodes is not fixed.
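
For orientation, the statistical mechanics formulation referred to above can be summarized by a Gibbs measure over the network weights. The following is a minimal sketch in standard notation, written here for illustration; the symbols E, epsilon, beta, and Z are not defined in this abstract and are introduced only as assumptions of the usual setup:

    P(\mathbf{w}) = \frac{1}{Z}\, e^{-\beta E(\mathbf{w})}, \qquad
    E(\mathbf{w}) = \sum_{\mu=1}^{P} \epsilon\bigl(\mathbf{w};\, \mathbf{x}^{\mu}, y^{\mu}\bigr), \qquad
    Z = \int d\mu(\mathbf{w})\, e^{-\beta E(\mathbf{w})},

where \epsilon measures the error of the network with weights \mathbf{w} on training example (\mathbf{x}^{\mu}, y^{\mu}), and \beta = 1/T is an inverse temperature controlling the tolerance to training errors. In the standard treatment, the high-temperature limit corresponds to \beta \to 0 with \beta P / N held fixed, where N is the number of weights.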