ABSTRACT

We are given a training set $D = \{(x_i, y_i),\ i = 1, \dots, n : x_i \in \mathcal{X} \subset \mathbb{R}^p,\ y_i \in \mathbb{R}\}$, where the $y_i$'s are realizations of $Y_i = f^*(x_i) + \epsilon_i$, with the $\epsilon_i$'s representing the noise terms, herein assumed to be independently normally distributed with variance $\sigma^2$. Our goal is to use the information contained in the data $D$ to build an estimator $\hat{f}$ of the true unknown function $f^*$ that achieves the smallest mean squared error over $\mathcal{X} \times \mathcal{Y}$, or more specifically.