Model approximation in dynamic programming – general theory | 18

ABSTRACT

In this chapter we consider the problem of bounding the error in attained costs that results from replacing a model with either a single approximation or a sequence of approximations. We are given a model $\pi = (R, Q)$ satisfying definitions (M1)–(M6) of Section 12.1, where $Q$ may be a single kernel or a family of kernels, so that both policy evaluation and dynamic programming are of interest. If the model were known, a value iteration algorithm (VIA) $V_{k+1} = \bar{T}_\pi V_k$ would be employed.
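As an informal illustration of the setting, the following sketch runs value iteration on a small finite model and on a perturbed copy of it, then compares the two fixed points. Everything specific here is an assumption made for the example rather than something taken from the chapter: a finite state and action space, a discounted cost-minimizing Bellman operator with factor `beta`, a randomly generated kernel `Q`, and a perturbed kernel `Q_hat`. The bound checked at the end is the standard contraction argument, not necessarily the form of the bounds developed in the chapter.

```python
import numpy as np

def bellman(V, R, Q, beta):
    # (T V)(s) = min_a [ R(s, a) + beta * sum_{s'} Q(s' | s, a) V(s') ]
    # R has shape (S, A); Q has shape (S, A, S); Q @ V has shape (S, A).
    return (R + beta * (Q @ V)).min(axis=1)

def value_iteration(R, Q, beta, tol=1e-10, max_iter=10_000):
    # Iterate V_{k+1} = T V_k from V_0 = 0 until successive iterates agree.
    V = np.zeros(R.shape[0])
    for _ in range(max_iter):
        V_next = bellman(V, R, Q, beta)
        if np.max(np.abs(V_next - V)) < tol:
            return V_next
        V = V_next
    return V

# Hypothetical model: sizes, costs and kernels are arbitrary choices.
rng = np.random.default_rng(0)
n_states, n_actions, beta = 5, 3, 0.9
R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))
Q = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))

# Approximate kernel: a convex mixture of Q with random noise, so every
# row of Q_hat is still a probability distribution.
noise = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
Q_hat = 0.95 * Q + 0.05 * noise

V = value_iteration(R, Q, beta)          # fixed point of the true operator
V_hat = value_iteration(R, Q_hat, beta)  # fixed point of the approximation

# Contraction argument: ||V - V_hat|| <= ||T V - T_hat V|| / (1 - beta).
gap = np.max(np.abs(V - V_hat))
bound = np.max(np.abs(bellman(V, R, Q, beta) - bellman(V, R, Q_hat, beta))) / (1.0 - beta)
print(f"sup-norm gap: {gap:.6f}, contraction bound: {bound:.6f}")
```

The mixture construction for `Q_hat` is only one convenient way to produce a nearby kernel; in practice the approximation would come from estimation or discretization, and the size of the one-step operator error would have to be bounded by other means.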