Asynchronous Parallel Learning for Model-Free and Model-Based Reinforcement Learning

  • UCHIBE Eiji
    Advanced Telecommunications Research Institute International

Bibliographic Information

Other Title
  • モデルフリーとモデルベース強化学習のための非同期並列学習 (Asynchronous Parallel Learning for Model-Free and Model-Based Reinforcement Learning)

Abstract

<p>Reinforcement learning algorithms fall into model-based methods, which explicitly estimate an environmental model and a reward function, and model-free methods, which learn a policy directly from real or generated experiences. We previously proposed a parallel reinforcement learning algorithm that trains multiple model-free and model-based learners, and the experimental results showed that a simple algorithm can contribute to the learning of more complex ones. However, because each learner's computation time was not taken into account, the advantage of using a simple model-free learner could not be fully demonstrated. This paper proposes an asynchronous parallel reinforcement learning method that accounts for differences in control frequency among the learners. The main contributions are separating the replay buffers collected by each learner and transforming the stored experiences to absorb the differences in control frequency. The proposed method is applied to benchmark problems and compared with a variant that ignores these differences. The results show that the proposed algorithm selects the simple model-based method with a short control period in the early stage of learning, the complex model-based method in the middle stage, and the model-free method in the late stage.</p>
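The abstract does not give the implementation details of the per-learner buffers or the experience transformation, but the two ideas can be illustrated with a minimal sketch. The snippet below is an assumption-laden illustration, not the authors' method: it keeps a separate replay buffer per learner, and absorbs a control-frequency mismatch by collapsing every `k` fine-grained transitions (control period `dt`) into one coarse transition (period `k*dt`) with a discount-summed reward, in the style of an n-step aggregation. The names `ReplayBuffer` and `aggregate` and the transition tuple layout are hypothetical.

```python
import random
from collections import deque

GAMMA = 0.99  # discount factor (illustrative value)

class ReplayBuffer:
    """Per-learner buffer: each learner stores only its own experiences."""
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        # transition = (state, action, reward, next_state, done)
        self.buffer.append(transition)

    def sample(self, batch_size):
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))

def aggregate(fast_transitions, k, gamma=GAMMA):
    """Collapse k fine-grained transitions into one coarse transition:
    keep the first state/action and the last next_state/done, and
    discount-sum the intermediate rewards, so a slow learner can
    consume experiences collected at a higher control frequency."""
    coarse = []
    for i in range(0, len(fast_transitions) - k + 1, k):
        chunk = fast_transitions[i:i + k]
        s, a = chunk[0][0], chunk[0][1]
        r = sum(gamma ** j * t[2] for j, t in enumerate(chunk))
        s_next, done = chunk[-1][3], chunk[-1][4]
        coarse.append((s, a, r, s_next, done))
    return coarse
```

Under these assumptions, a learner operating at twice the control period would call `aggregate(..., k=2)` on experiences produced by the fastest learner before adding them to its own buffer.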

