Acceleration of Reinforcement Learning by Efficient Policy Evaluation (強化学習における方策評価の効率化による学習の加速)

泉田 啓, 服部 俊, 幸田 武久

doi:10.9746/sicetr.49.696

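泉田啓: Graduate School of Engineering, Kyoto University
服部俊: Graduate School of Engineering, Kyoto University
幸田武久: Graduate School of Engineering, Kyoto University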

Bibliographic Information

Alternative titles

Acceleration of Reinforcement Learning by Efficient Policy Evaluation
キョウカガクシュウニオケルホウサクヒョウカノコウリツカニヨルガクシュウノカソク

Abstract

Typical methods for solving reinforcement learning problems iterate two steps: policy evaluation and policy improvement. This study proposes policy evaluation algorithms that improve learning efficiency. The proposed algorithms, based on the Krylov subspace method (KSM), are tens to hundreds of times more efficient than existing algorithms based on stationary iterative methods (SIM). KSM-based algorithms are far more efficient than has generally been expected. Using numerical examples and theoretical discussion, this study clarifies what makes KSM-based algorithms more efficient.
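The abstract does not reproduce the algorithms themselves, so the following is only a minimal sketch of the contrast it describes, not the authors' method. It assumes a finite Markov chain under a fixed policy with a known row-stochastic transition matrix `P`, a reward vector `r`, and a discount factor `gamma` (all illustrative names). Policy evaluation is then the linear system (I - gamma P) V = r. The fixed-point sweep V <- r + gamma P V stands in for a stationary iterative method, and a textbook unrestarted GMRES stands in for a Krylov subspace method. The random chain is a toy problem chosen for brevity, so the iteration counts it prints say nothing about the paper's reported results.

```python
import numpy as np


def evaluate_stationary(P, r, gamma, tol=1e-8, max_iter=100_000):
    """Stationary iteration: repeat V <- r + gamma * P @ V from V = 0."""
    V = np.zeros_like(r)
    r_norm = np.linalg.norm(r)
    for k in range(max_iter):
        V_next = r + gamma * (P @ V)
        # V_next - V equals the residual r - (I - gamma P) V of the current V.
        if np.linalg.norm(V_next - V) <= tol * r_norm:
            return V, k
        V = V_next
    return V, max_iter


def evaluate_krylov(P, r, gamma, tol=1e-8, max_iter=None):
    """Unrestarted GMRES on (I - gamma P) V = r, starting from V = 0."""
    n = r.size
    max_iter = n if max_iter is None else max_iter
    beta = np.linalg.norm(r)
    Q = np.zeros((n, max_iter + 1))         # orthonormal Krylov basis
    H = np.zeros((max_iter + 1, max_iter))  # upper Hessenberg matrix
    Q[:, 0] = r / beta
    for k in range(max_iter):
        # Arnoldi step: apply A = I - gamma P, then orthogonalise
        # against the existing basis (modified Gram-Schmidt).
        w = Q[:, k] - gamma * (P @ Q[:, k])
        for j in range(k + 1):
            H[j, k] = Q[:, j] @ w
            w -= H[j, k] * Q[:, j]
        H[k + 1, k] = np.linalg.norm(w)
        # Minimise || beta * e1 - H y || over the current subspace.
        e1 = np.zeros(k + 2)
        e1[0] = beta
        Hk = H[:k + 2, :k + 1]
        y, *_ = np.linalg.lstsq(Hk, e1, rcond=None)
        residual = np.linalg.norm(e1 - Hk @ y)
        if residual <= tol * beta or H[k + 1, k] < 1e-14:
            return Q[:, :k + 1] @ y, k + 1
        Q[:, k + 1] = w / H[k + 1, k]
    return Q[:, :max_iter] @ y, max_iter


# Toy problem (an assumption for illustration): a dense random chain.
n, gamma = 200, 0.99
rng = np.random.default_rng(0)
P = rng.random((n, n))
P /= P.sum(axis=1, keepdims=True)  # make each row sum to 1
r = rng.random(n)

V_sim, k_sim = evaluate_stationary(P, r, gamma)
V_ksm, k_ksm = evaluate_krylov(P, r, gamma)
print(f"stationary sweeps: {k_sim}, Krylov iterations: {k_ksm}")
```

Both routines stop at the same relative residual, so their iteration counts are comparable as counts of matrix-vector products with `P`. The sweep's convergence rate is bounded by `gamma`, which is why it slows down as `gamma` approaches 1, whereas GMRES adapts to the spectrum of I - gamma P.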

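Journal

Transactions of the Society of Instrument and Control Engineers (計測自動制御学会論文集) 49 (7), 696-702, 2013

The Society of Instrument and Control Engineers (公益社団法人計測自動制御学会)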

References: 12

Details

CRID: 1390282679479140992
NII Article ID: 10031188141
NII Book ID: AN00072392
DOI: 10.9746/sicetr.49.696
ISSN: 18838189, 04534654
NDL BIB ID: 024821264
Web Site:
http://id.ndl.go.jp/bib/024821264
https://ndlsearch.ndl.go.jp/books/R000000004-I024821264
https://www.jstage.jst.go.jp/article/sicetr/49/7/49_696/_pdf
Text language code: ja
Data source:
- JaLC
- NDL
- Crossref
- CiNii Articles
Abstract license flag: Disallowed
