Bibliographic information
- Alternative titles
- Hybrid Learning Using Profit Sharing and Genetic Algorithm under the POMDPs
- Fukanzen chikaku mondai ni taisuru Profit Sharing to identeki arugorizumu o mochiita haiburiddo gakushū (romanized reading of the Japanese title)
Description
<p>Reinforcement learning is generally formulated on Markov decision processes (MDPs). However, an agent may be unable to observe the environment correctly because of the limited perception ability of its sensors. Such a setting is called a partially observable Markov decision process (POMDP). In a POMDP environment, an agent may receive the same observation in more than one state. HQ-learning and Episode-based Profit Sharing (EPS) are well-known methods for this problem. HQ-learning divides a POMDP environment into subtasks. EPS distributes the same reward to all state-action pairs in the episode when the agent reaches a goal. However, these methods have drawbacks in terms of learning efficiency and convergence to local solutions. In this paper, we propose a hybrid learning method that combines Profit Sharing (PS) and a genetic algorithm. We also report the effectiveness of the method through experiments on partially observable mazes.</p>
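The EPS credit assignment described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `reinforce_episode`, the weight table, the observation labels, and the choice to credit each distinct pair once per episode are all assumptions made for illustration. The abstract only states that the same reward goes to the state-action pairs in the episode.

```python
from collections import defaultdict


def reinforce_episode(weights, episode, reward):
    """Add the same reward to every distinct (observation, action) pair.

    Assumption: a pair that occurs several times in the episode is
    credited only once. The abstract does not specify this detail.
    """
    for pair in set(episode):
        weights[pair] += reward
    return weights


# Hypothetical episode that ended at the goal. The observation "corridor"
# appears twice, standing in for two different states that look identical
# to the agent (the same observation in more than one state).
weights = defaultdict(float)
episode = [("corridor", "east"), ("junction", "north"), ("corridor", "east")]
reinforce_episode(weights, episode, reward=1.0)
```

Because credit is uniform rather than discounted by distance from the goal, every pair on a successful episode is reinforced equally, including the pairs recorded at aliased observations.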
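Journal
- IEEJ Transactions on Electronics, Information and Systems (電気学会論文誌C, 電子・情報・システム部門誌)
IEEJ Transactions on Electronics, Information and Systems 137 (12), 1591-1599, 2017
The Institute of Electrical Engineers of Japan (一般社団法人 電気学会)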
Details
- CRID: 1390001204609709696
- NII Article ID: 130006235400
- NII Bibliographic ID: AN10065950
- ISSN: 13488155, 03854221
- NDL Bibliographic ID: 028724878
- Text language code: ja
- Data source types: JaLC, NDL Search, Crossref, CiNii Articles, OpenAIRE
- Abstract license flag: Disallowed