ABSTRACT

This chapter presents robust strategies for settings in which players have imperfect state information and outdated measurements. To account for imperfect payoff measurements on the players' side, it proposes a delayed COmbined fully DIstributed Payoff and Strategy Reinforcement Learning scheme (delayed CODIPAS-RL). The chapter aims to develop a heterogeneous, delayed CODIPAS-RL framework for discrete-action dynamic games under uncertainty and delayed feedback. It illustrates the learning algorithm with two transmitters and three channels. Competitive Shannon rate maximization is an important signal-processing problem for power-constrained multiuser systems. It involves solving the power allocation problem for mutually interfering transmitters operating across multiple frequencies. A Nash equilibrium of the rate-maximization game is a power allocation configuration such that, given the power allocations of the other transmitters, no transmitter can increase its achieved information rate unilaterally.
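
The Nash equilibrium condition stated above can be made concrete with a minimal sketch for two transmitters and three channels. This is not the delayed CODIPAS-RL algorithm: it enumerates all action profiles rather than learning from delayed payoffs. It assumes, for illustration only, that each transmitter puts its whole power budget on a single channel, and the names `DIRECT`, `CROSS`, `POWER`, `NOISE`, `rate`, and `is_nash`, together with all numerical values, are hypothetical and do not come from the chapter.

```python
import itertools
import math

NUM_TRANSMITTERS = 2
NUM_CHANNELS = 3
POWER = 1.0   # assumed power budget per transmitter
NOISE = 0.1   # assumed noise power per channel

# Made-up channel gains, for illustration only.
# DIRECT[j][c]: gain from transmitter j to its own receiver on channel c.
DIRECT = [[1.0, 0.8, 0.5],
          [0.6, 1.0, 0.9]]
# CROSS[j][c]: gain from the other transmitter to receiver j on channel c.
CROSS = [[0.5, 0.5, 0.5],
         [0.5, 0.5, 0.5]]


def rate(j, profile):
    """Shannon rate (bits per channel use) of transmitter j.

    profile[i] is the channel chosen by transmitter i.
    """
    channel = profile[j]
    other_channel = profile[1 - j]
    # Interference is present only if both transmitters share the channel.
    interference = CROSS[j][channel] * POWER if other_channel == channel else 0.0
    sinr = DIRECT[j][channel] * POWER / (NOISE + interference)
    return math.log2(1.0 + sinr)


def is_nash(profile):
    """True if no transmitter can raise its own rate by deviating alone."""
    for j in range(NUM_TRANSMITTERS):
        current = rate(j, profile)
        for channel in range(NUM_CHANNELS):
            deviation = list(profile)
            deviation[j] = channel
            if rate(j, tuple(deviation)) > current + 1e-12:
                return False
    return True


# Enumerate all 3 x 3 pure action profiles and keep the equilibria.
equilibria = [
    profile
    for profile in itertools.product(range(NUM_CHANNELS), repeat=NUM_TRANSMITTERS)
    if is_nash(profile)
]
print(equilibria)
```

With these assumed gains, the only pure equilibrium is the profile in which the first transmitter uses channel 0 and the second uses channel 1, since each then occupies its best channel without interference. A learning scheme such as the one the chapter proposes would have to reach this kind of configuration from noisy, delayed payoff measurements alone, without access to the gain tables used here.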