Parallel Deep Reinforcement Learning with Model-Free and Model-Based Methods

UCHIBE Eiji

doi:10.11517/pjsai.jsai2020.0_1q4gs1103

Bibliographic Information

Other Title

モデルフリーとモデルベースの協同による並列深層強化学習

Description

Reinforcement learning methods can be categorized into model-based methods, which exploit an (estimated) environmental model, and model-free methods, which learn a policy directly through interaction with the environment. To improve learning efficiency, we previously proposed CRAIL, which trains multiple heterogeneous learning modules simultaneously and dynamically selects one of them according to learning performance. However, CRAIL does not consider model-based methods. This study extends CRAIL to handle both model-based and model-free methods and investigates whether dynamic switching between them improves learning efficiency. The proposed method was evaluated on MuJoCo benchmark tasks. Experimental results show that a model-based method with a simple model was selected in the early stage of learning, and a model-based method with a complicated model was used in the later stage. Furthermore, model-free methods were selected when the network did not have sufficient capacity to represent the environmental dynamics.

Journal

Proceedings of the Annual Conference of JSAI

Proceedings of the Annual Conference of JSAI JSAI2020 (0), 1Q4GS1103-1Q4GS1103, 2020

The Japanese Society for Artificial Intelligence

Details

CRID: 1390848250119369088

NII Article ID: 130007856694

DOI: 10.11517/pjsai.jsai2020.0_1q4gs1103

Text Lang: ja

Data Source

JaLC
CiNii Articles

Abstract License Flag: Disallowed
