Heterogeneous Computing of Multi-Agent Deep Reinforcement Learning on Edge Devices for Internet of Things

doi:10.1201/9781003363606-2

Chapter

ABSTRACT

The growing number of Internet of Things (IoT) applications that rely on artificial intelligence (AI) has made it necessary to process data at the edge of the network, on or near the IoT devices themselves. As behavioural learning gains traction and the interests of the AI research community shift towards it, applications based on Multi-Agent Deep Reinforcement Learning (MADRL) are set to play an important role at both the industrial and the consumer level. The major challenge in executing MADRL-based applications at the edge of the network is that they require substantial hardware and memory resources, both of which lead to increased power consumption. This work proposes a novel Field Programmable Gate Array (FPGA)-based hardware architecture for the heterogeneous computing of such MADRL-driven IoT applications. The viability and validity of the proposed approach are demonstrated for the Collaborative Box Push application, using the available on-chip memory. The proposed hardware architecture was designed using Xilinx Vivado HLS 2019.2; synthesis, implementation, and bitstream generation were carried out in Xilinx Vivado 2019.2, targeting the Avnet Ultra96 v2 embedded FPGA board. Validation was done using the Xilinx PYNQ framework. The proposed approach achieves an acceleration of 127x over a software implementation running on an Intel i3-8130U CPU. Resource utilization and power statistics for the Avnet Ultra96 v2 embedded FPGA board are presented to show that the proposed MADRL hardware architecture is viable for deployment at the edge of the network.