A Method for Unsupervised Skill Acquisition in Discrete Action Spaces

海野 良介, 鶴岡 慶雅

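Deep reinforcement learning has come to achieve high performance in games such as Atari 2600 and Go. One remaining challenge, however, is that it is difficult to design a reward function that makes the learning agent behave as desired. In this paper, we propose a method by which an agent in an environment with discrete actions learns "skills", that is, policies that produce consistent behavior, without being given any external reward. As experiments, we applied the method to three tasks with discrete action spaces: a two-dimensional grid space, MountainCar-v0, and Freeway. We examined whether diverse skills are learned, and we evaluated performance when the learned skills are used for training with a hierarchical reinforcement learning method. The results show that the learned "skills" are useful for accomplishing the tasks.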

Deep reinforcement learning can now achieve high performance in games such as Atari 2600 and Go. However, one of the challenges in reinforcement learning is to design a reward function that leads the agent to learn a policy with desired actions. In this study, we propose a method for learning skills in discrete action spaces without any external rewards. We conducted experiments on three discrete action tasks, namely, 2D grid space, MountainCar-v0, and Freeway, to confirm that the agent can learn diverse sets of skills. We also applied the learned skills to hierarchical reinforcement learning tasks to measure whether the skills can be used in downstream tasks. As a result, we found that learned skills are useful for solving tasks.
