ABSTRACT

An unsupervised learning method for an evaluation function is proposed. Using this function, a robot can learn a series of motions that achieves a given purpose. The evaluation function is computed by a neural network, and the time derivative of this function is reduced during motions with random trials. We confirmed by simulation that a robot with asymmetric motion characteristics could obtain an appropriately asymmetric evaluation function and come to achieve the given purpose along an almost optimal path.