ABSTRACT

This chapter delves into the foundational concepts of neural network (NN)-based approximation and its application in optimal control. The core focus here is the integration of these fundamentals into the broader framework of event-based optimal control, a topic to be thoroughly explored in the subsequent chapters. We shall begin with the notion of function approximation, a significant feature of artificial NNs. Alongside this, the principles of dynamic programming and its adaptive variant, referred to as adaptive/approximate dynamic programming (ADP), shall be examined in both discrete- and continuous-time domains. Neural networks, particularly those with multiple layers, are widely acknowledged for their capacity to approximate nonlinear functions. This inherent property, when applied to learning dynamical systems, facilitates the computation of a control policy (a process known as indirect adaptive control of nonlinear systems) or enables the direct approximation of control policies. We shall briefly review the evolution of NNs, taking a deep dive into their mathematical structures, universal approximation property, and training mechanisms. Following the discourse on NNs, we shall focus on optimal control theory. This field of mathematical optimization strives to synthesize a control policy that optimizes a specified objective function and steers a dynamical system over time. We shall see that the "curse of dimensionality," a notorious issue that precludes dynamic programming from being employed in large state spaces, has led to the advent of ADP. We shall see that, as a unifying approach, ADP allows for the design of optimal control solutions without comprehensive knowledge of system dynamics by combining adaptive control, dynamic programming, and reinforcement learning techniques.