Performance of LQ-learning in POMDP Environments

Lee Haeyeon, Kamaya Hiroyuki, Abe Kenich

doi:10.11499/sicep.2002.0.174.0

Description

In this paper, we propose a new type of LQ-learning for solving POMDPs. In a POMDP environment, the agent cannot observe the state of the environment directly. In LQ-learning, in order to discriminate between partially observed states, the agent attaches a label to each of the observations that are perceived as identical. Unlike our previous LQ-learning, knowledge about the environment is prepared in advance. This knowledge is acquired automatically by Kohonen’s Self-Organizing Map (SOM), which provides the agent with knowledge about state transitions. The LQ-learning agent then attaches labels to observations by referring to the map obtained by the SOM.
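
The labeling idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not give the update rule or the labeling procedure, so the code below assumes a standard tabular Q-learning update over (observation, label) pairs, and it replaces the SOM with a hypothetical lookup, `assign_label`, that picks a label from the previous observation. The names `assign_label`, `q_update`, `ALPHA`, and `GAMMA`, and the observation strings, are all illustrative assumptions.

```python
from collections import defaultdict

ALPHA, GAMMA = 0.5, 0.9  # illustrative learning rate and discount, not from the paper


def make_q_table():
    # Q-values keyed by ((observation, label), action); unseen entries default to 0.0
    return defaultdict(float)


def assign_label(label_map, prev_obs, obs):
    """Hypothetical stand-in for the SOM-based map: the label given to `obs`
    depends on the observation that preceded it, so identical observations
    reached through different transitions receive different labels."""
    key = (prev_obs, obs)
    if key not in label_map:
        # next unused label index for this observation
        label_map[key] = sum(1 for k in label_map if k[1] == obs)
    return label_map[key]


def q_update(q, state, action, reward, next_state, actions):
    # Standard one-step Q-learning update applied to the labeled state
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])


if __name__ == "__main__":
    label_map = {}
    actions = ["left", "right"]
    q = make_q_table()

    # Two hidden states both look like "corridor" but are reached differently.
    a = assign_label(label_map, "west_wall", "corridor")
    b = assign_label(label_map, "east_wall", "corridor")

    # Only the first labeled state is rewarded for moving right.
    q_update(q, ("corridor", a), "right", 1.0, ("goal", 0), actions)
    print(a, b, q[(("corridor", a), "right")], q[(("corridor", b), "right")])
```

Because the Q-table is keyed by the labeled observation, the two aliased "corridor" states keep separate value estimates, which is the role the abstract assigns to labels.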

Published in

SICE Annual Conference Program and Abstracts

SICE Annual Conference Program and Abstracts 2002 (0), 174-174, 2002

The Society of Instrument and Control Engineers (SICE)

Details

CRID: 1390282680561053568

NII Article ID: 130006960136

DOI: 10.11499/sicep.2002.0.174.0

Text language code: en

Data source type

JaLC
CiNii Articles

Abstract license flag: Disallowed
