Reinforcement Learning with Temperature Distribution Based on Likelihood Function (尤度情報に基づく温度分布を用いた強化学習法)

Norimasa Kobori (小堀 訓成), Kenji Suzuki (鈴木 健嗣), Pitoyo Hartono (ハルトノ ピトヨ), Shuji Hashimoto (橋本 周司)

doi:10.1527/tjsai.20.297


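Norimasa Kobori (小堀訓成)
Department of Pure and Applied Physics, Graduate School of Science and Engineering, Waseda University

Kenji Suzuki (鈴木健嗣)
Graduate School of Systems and Information Engineering, University of Tsukuba

Pitoyo Hartono (ハルトノ ピトヨ)
Department of Information Architecture, Future University Hakodate

Shuji Hashimoto (橋本周司)
Department of Applied Physics, School of Science and Engineering, Waseda University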

Bibliographic Information

Alternative titles

Reinforcement Learning with temperature distribution based on likelihood function
Yūdo jōhō ni motozuku ondo bunpu o mochiita kyōka gakushūhō (reading of the Japanese title)

Abstract

In existing reinforcement learning methods, finding appropriate meta-parameters, such as the learning rate, eligibility traces, and the temperature for exploration, is difficult and time-consuming. In particular, on complicated and large-scale problems, delayed rewards often occur and make the problem hard to solve. In this paper, we propose a novel method that introduces a temperature distribution for reinforcement learning. In addition to acquiring a policy based on profit sharing, a temperature is assigned to each state and trained by a hill-climbing method using a likelihood function based on the success and failure of the task. The proposed method reduces the amount of parameter setting needed for a given problem. We demonstrate its performance on a grid world problem and on the control of the Acrobot.
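
The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes Boltzmann (softmax) action selection with one temperature per state, and a simple hill-climbing step that keeps a perturbed temperature only when it raises a Bernoulli log-likelihood of the observed success and failure counts. The toy success model (an episode succeeds exactly when one designated action is chosen), the state names, the action weights, the counts, and every function name are hypothetical.

```python
import math
import random


def boltzmann_probs(weights, temperature):
    """Softmax action probabilities; a higher temperature means more exploration."""
    peak = max(weights)  # subtract the maximum for numerical stability
    exps = [math.exp((w - peak) / temperature) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]


def log_likelihood(temperature, weights, good_action, successes, failures):
    """Bernoulli log-likelihood of the outcome counts.

    Toy assumption (not from the paper): an episode succeeds exactly when
    `good_action` is selected in this state.
    """
    p = boltzmann_probs(weights, temperature)[good_action]
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep both logarithms finite
    return successes * math.log(p) + failures * math.log(1.0 - p)


def hill_climb_step(temperature, score, rng, step=0.1, t_min=0.1):
    """Try one neighboring temperature; keep it only if the score improves."""
    candidate = max(t_min, temperature + rng.choice([-step, step]))
    return candidate if score(candidate) > score(temperature) else temperature


def make_score(data):
    """Bind one state's (hypothetical) data to the likelihood function."""
    return lambda t: log_likelihood(t, **data)


rng = random.Random(0)  # seeded so the run is reproducible

# Hypothetical per-state data: action weights, rewarded action, outcome counts.
states = {
    "s0": {"weights": [1.0, 2.0], "good_action": 1, "successes": 8, "failures": 2},
    "s1": {"weights": [2.0, 1.0], "good_action": 0, "successes": 6, "failures": 4},
}
temperatures = {name: 2.0 for name in states}  # one temperature per state

for _ in range(300):
    for name, data in states.items():
        temperatures[name] = hill_climb_step(temperatures[name], make_score(data), rng)

for name in states:
    print(name, round(temperatures[name], 2))
```

In this toy setup the state with the higher observed success rate settles at a lower temperature (more exploitation), while the state with more failures settles at a higher one (more exploration), which is the qualitative behavior a per-state temperature is meant to allow.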

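Journal

Transactions of the Japanese Society for Artificial Intelligence (人工知能学会論文誌) 20, 297-305, 2005

The Japanese Society for Artificial Intelligence (一般社団法人人工知能学会)

Cited by: 3

References: 21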

Details

CRID: 1390001205108859904
NII Article ID: 10022005408
NII Bibliographic ID: AA11579226
DOI: 10.1527/tjsai.20.297
ISSN: 13468030, 13460714
NDL Bibliographic ID: 8685270

Web Site

http://id.ndl.go.jp/bib/8685270

https://ndlsearch.ndl.go.jp/books/R000000004-I8685270

http://www.jstage.jst.go.jp/article/tjsai/20/4/20_4_297/_pdf

Text language code: ja

Data source type
- JaLC
- NDL
- Crossref
- CiNii Articles

Abstract license flag: use not permitted
