A Study on State Grouping and Opportunity Evaluation for Reinforcement Learning Methods (強化学習法のための状態グルーピングとオポチュニチ評価に関する研究)

Bibliographic information

Alternative titles
  • A Study on State Grouping and Opportunity Evaluation for Reinforcement Learning Methods
  • キョウカ ガクシュウホウ ノ タメ ノ ジョウタイ グルーピング ト オポチュ

Abstract

In this paper, we propose a State Grouping scheme to address the problem of scaling Reinforcement Learning algorithms up to real, large-scale applications. The grouping scheme is based on geographical and trial-and-error information and consists of state generating, state combining, state splitting, and state forgetting procedures, together with a corresponding action selection module and learning module. We also discuss a Labeling-Based Evaluation scheme that evaluates the opportunity of each state-action pair and thereby uses better experience to guide exploration of the state space effectively. Incorporating the Labeling-Based Evaluation and State Grouping schemes into a Reinforcement Learning algorithm yields an approach that generates an organized state space for Reinforcement Learning while also solving the problem at hand. We argue that this ability is necessary for an autonomous agent: such an agent cannot rely on a predefined map and should instead explore the environment and find the optimal solution autonomously and simultaneously. By solving the 3-DOF and 4-link manipulator problem, which has a large state space, we show the efficiency of the proposed approach: the agent finds the optimal or a sub-optimal path with less memory and less time.
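The abstract does not give the procedures in detail, so the following is only a minimal sketch of the general idea of growing a grouped state space during learning, not the authors' method. It assumes that a group is a ball of fixed radius around a representative point, that a new group is generated when an observation falls outside every existing group, and that a group is forgotten after a fixed number of steps without a visit. The class name `GroupedQLearner` and the parameters `radius` and `max_idle` are hypothetical, the learning rule is ordinary tabular Q-learning, and state combining, state splitting, and the Labeling-Based Evaluation are omitted.

```python
import math
import random


class GroupedQLearner:
    """Tabular Q-learning over state groups created on the fly (illustrative only)."""

    def __init__(self, n_actions, radius=1.0, alpha=0.5, gamma=0.9, max_idle=1000):
        self.n_actions = n_actions
        self.radius = radius      # hypothetical: maximum distance from a group center
        self.alpha = alpha        # learning rate
        self.gamma = gamma        # discount factor
        self.max_idle = max_idle  # hypothetical: steps without a visit before forgetting
        self.centers = {}         # group id -> representative point
        self.q = {}               # group id -> list of action values
        self.last_visit = {}      # group id -> step of the most recent visit
        self.step = 0
        self.next_id = 0

    def group_of(self, obs):
        """Return the group containing obs, generating a new group if none is near."""
        self.step += 1
        best, best_d = None, float("inf")
        for gid, center in self.centers.items():
            d = math.dist(obs, center)
            if d < best_d:
                best, best_d = gid, d
        if best is None or best_d > self.radius:
            # State generating: obs lies outside every existing group.
            best = self.next_id
            self.next_id += 1
            self.centers[best] = tuple(obs)
            self.q[best] = [0.0] * self.n_actions
        self.last_visit[best] = self.step
        return best

    def select_action(self, gid, epsilon=0.1):
        """Epsilon-greedy action selection over the group's action values."""
        if random.random() < epsilon:
            return random.randrange(self.n_actions)
        values = self.q[gid]
        return values.index(max(values))

    def update(self, gid, action, reward, next_gid):
        """One-step Q-learning update applied at the group level."""
        target = reward + self.gamma * max(self.q[next_gid])
        self.q[gid][action] += self.alpha * (target - self.q[gid][action])

    def forget(self):
        """State forgetting: drop groups not visited for more than max_idle steps."""
        stale = [g for g, t in self.last_visit.items() if self.step - t > self.max_idle]
        for g in stale:
            del self.centers[g], self.q[g], self.last_visit[g]
        return stale
```

In this sketch, memory grows only with the regions the agent actually visits, which is one plausible reading of the "less memory" claim. A fuller version would also call `forget` only between episodes, so that a group still referenced by a pending update is never removed.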
