ABSTRACT

This chapter explains several information criteria from two viewpoints: model selection and hyperparameter optimization. It investigates the properties of the generalization loss and of the free energy (the minus log marginal likelihood) from each viewpoint, and discusses a model selection problem. There are two methods of model selection: minimizing the generalization loss and minimizing the free energy. The chapter provides the definitions of several information criteria that are used to estimate the generalization loss. The generalization losses for the Bayesian, maximum likelihood, maximum a posteriori, and posterior mean methods are different. A parameter of a prior distribution is called a hyperparameter.