ABSTRACT

The goal of Chapter 13 is to provide methods for supporting the analysis and design of empirical risk functions. Sufficient conditions are provided that ensure a local minimizer of a smooth empirical risk function converges to a corresponding strict local minimizer of the expected empirical risk function. Minimizing a maximum likelihood empirical risk function is shown to maximize the likelihood of the observed data given a parameter vector and, additionally, to minimize the cross entropy between the learning machine's probability model and the data-generating distribution. Explicit conditions are provided that ensure a maximum likelihood empirical risk function is convex on the entire parameter space for exponential family probability models. Next, MAP (Maximum A Posteriori) empirical risk functions are introduced for evaluating the probability of parameter values given the observed data using the learning machine's probability model and a parameter prior. In addition, it is shown that MAP empirical risk functions are asymptotically equivalent to maximum likelihood empirical risk functions. Methods for interpreting empirical risk functions as MAP empirical risk functions are also discussed. At the end of the chapter, Bayes estimators are introduced and their relationship to MAP empirical risk functions is explored.
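
A minimal sketch of the maximum likelihood and cross-entropy connection summarized above, under assumed notation not fixed by this abstract ($p_e$ denotes the data-generating density, $p(x; \theta)$ the learning machine's probability model, and $x_1, \ldots, x_n$ the observed data): the maximum likelihood empirical risk function

$$\hat{\ell}_n(\theta) = -\frac{1}{n} \sum_{i=1}^{n} \log p(x_i; \theta)$$

has expected value

$$E\{\hat{\ell}_n(\theta)\} = -\int p_e(x) \log p(x; \theta)\, dx = H(p_e) + D_{\mathrm{KL}}\big(p_e \,\|\, p(\cdot\,; \theta)\big),$$

which is the cross entropy between $p_e$ and the model; minimizing it over $\theta$ therefore minimizes the Kullback-Leibler divergence between the data-generating distribution and the learning machine's probability model.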