ABSTRACT

In this chapter, we introduce the framework of least-squares policy iteration. In Section 2.1, we first explain policy iteration, which alternates between policy evaluation and policy improvement steps to find progressively better policies. Then, in Section 2.2, we show how value function approximation in the policy evaluation step can be formulated as a regression problem, and we introduce a least-squares algorithm called least-squares policy iteration (Lagoudakis & Parr, 2003). Finally, Section 2.3 concludes the chapter.