ABSTRACT

A core novelty of AlphaZero (a Google DeepMind project) is its combination of tree search and deep learning, which has proven highly effective in board games such as chess, shogi, and Go. These games have discrete action spaces; many reinforcement learning domains, however, involve continuous real-world control, for example robotics, navigation, and self-driving cars. This paper examines the effectiveness of these concepts through extensions of AlphaZero. It presents a theoretical comparison of two tree-search algorithms (Monte Carlo tree search and alpha-beta search) as used by two chess engines (AlphaZero and Stockfish, respectively), thereby providing a first step toward unsupervised learning.