ABSTRACT

This chapter shows how to use the state estimate for the optimal control of stochastic systems. Deterministic optimal control theory is generally useful only in idealized, noise-free situations in which the entire state can be measured exactly. Dynamic programming is based on Bellman's principle of optimality: an optimal policy has the property that, no matter what the previous decisions have been, the remaining decisions must constitute an optimal policy with regard to the state resulting from those previous decisions. The chapter shows that the principle of optimality serves to limit the number of potentially optimal control strategies that must be investigated. It also implies that optimal control strategies must be determined by working backward from the final stage; the optimal control problem is inherently a backward-in-time problem, in direct contrast to the optimal filtering problem, which is a forward-in-time problem. Dynamic programming can easily be used to find optimal controls for nonlinear stochastic systems.
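As a minimal sketch of the backward recursion described above (the symbols $f_k$, $L_k$, $\phi$, $w_k$, and $V_k$ are generic placeholders introduced here for illustration and are not defined in this abstract), a finite-horizon stochastic control problem with cost
\[
J = E\!\left[\phi(x_N) + \sum_{k=0}^{N-1} L_k(x_k, u_k)\right], \qquad x_{k+1} = f_k(x_k, u_k) + w_k,
\]
can be solved by dynamic programming through the cost-to-go functions
\[
V_N(x) = \phi(x), \qquad
V_k(x) = \min_{u}\, E\!\left[L_k(x, u) + V_{k+1}\!\big(f_k(x, u) + w_k\big)\right], \quad k = N-1, \ldots, 0,
\]
so the optimal cost-to-go is computed backward from the final stage, whereas the state estimate supplied to the controller is propagated forward in time by the filter.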