Policy Learning Using Modified Learning Vector Quantization for Reinforcement Learning Problems

Open Access
  • Afif Mohd Faudzi Ahmad
    Department of Electrical and Electronic Engineering, Graduate School of Information and Electrical Engineering, Kyushu University | Department of Electrical and Electronic Engineering, Universiti Malaysia
  • 村田 純一 (Junichi Murata)
    Professor, Electrical System Engineering, Faculty of Information Science and Electrical Engineering, Kyushu University


Abstract

Reinforcement learning (RL) enables an agent to find an optimal solution to a problem by interacting with the environment. In previous research, Q-learning, one of the popular learning methods in RL, was used to generate a policy, from which an abstract policy was then extracted by the LVQ algorithm. In this paper, the aim is to train the agent to learn an optimal policy from scratch and to generate the abstract policy in a single operation using the LVQ algorithm. When the LVQ algorithm is applied in an RL framework, erroneous teaching signals sometimes cause the learning to end in failure or in a non-optimal solution. Here, a new LVQ algorithm is proposed to overcome this problem. The new algorithm introduces, first, a regular reward that the agent obtains autonomously based on its behavior and, second, a function that converts the regular reward into a new reward so that the learning system does not suffer from the undesirable effects of a small reward. Through these modifications, the agent is expected to find the optimal solution more efficiently.
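The abstract gives only the outline of the modified algorithm, so the following is a minimal Python sketch of the idea it describes: action-labeled prototypes select behavior by nearest-neighbor matching, and a reward-conversion function rescales the agent's regular reward before it modulates the LVQ update. All names here (`LVQPolicy`, `shape_reward`) and the concrete `tanh` conversion are assumptions for illustration, not the paper's actual formulation.

```python
import numpy as np

class LVQPolicy:
    """Hypothetical reward-modulated LVQ policy (sketch, not the paper's exact method)."""

    def __init__(self, n_prototypes, state_dim, n_actions, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        # Each prototype is a point in state space with a fixed action label.
        self.protos = rng.normal(size=(n_prototypes, state_dim))
        self.actions = rng.integers(n_actions, size=n_prototypes)
        self.lr = lr

    def act(self, state):
        # Winner-take-all: act according to the nearest prototype.
        winner = np.argmin(np.linalg.norm(self.protos - state, axis=1))
        return self.actions[winner], winner

    @staticmethod
    def shape_reward(r, scale=5.0):
        # Placeholder for the paper's reward-conversion function: squash the
        # regular reward so that small rewards still yield a usable signal.
        return np.tanh(scale * r)

    def update(self, state, winner, reward):
        # Reward-modulated LVQ step: a positive shaped reward pulls the
        # winning prototype toward the visited state; a negative one pushes
        # it away, replacing a fixed (possibly erroneous) teaching signal.
        g = self.shape_reward(reward)
        self.protos[winner] += self.lr * g * (state - self.protos[winner])
```

In an episode loop, the agent would call `act` on each observed state, receive a reward from the environment, and call `update` on the winning prototype, so policy learning and prototype (abstract policy) formation happen in the same operation.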

Published In

Details

  • CRID
    1390572174796264192
  • NII Article ID
    120005697190
  • NII Bibliographic ID
    AN10569524
  • DOI
    10.15017/1560523
  • ISSN
    21880891
    13423819
  • HANDLE
    2324/1560523
  • Text Language Code
    en
  • Data Source Type
    • JaLC
    • IRDB
    • CiNii Articles
  • Abstract License Flag
    Allowed
