Acceleration of Reinforcement Learning by Efficient Policy Evaluation
-
- SENDA Kei
- Graduate School of Engineering, Kyoto University
-
- HATTORI Suguru
- Graduate School of Engineering, Kyoto University
-
- KOHDA Takehisa
- Graduate School of Engineering, Kyoto University
Bibliographic Information
- Other Title
-
- 強化学習における方策評価の効率化による学習の加速 (Acceleration of Learning by Efficient Policy Evaluation in Reinforcement Learning)
Abstract
Typical methods for solving reinforcement learning problems iterate two steps: policy evaluation and policy improvement. This study proposes policy-evaluation algorithms that improve learning efficiency. The proposed algorithms, based on the Krylov Subspace Method (KSM), are tens to hundreds of times more efficient than existing algorithms based on Stationary Iterative Methods (SIM). Algorithms based on KSM are far more efficient than has generally been expected. This study clarifies what makes KSM-based algorithms more efficient through numerical examples and theoretical discussion.
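The comparison in the abstract can be illustrated on a small synthetic problem. For a fixed policy, policy evaluation amounts to solving the linear Bellman system (I - γP)v = r, which a SIM attacks with the fixed-point sweep v ← r + γPv, while a KSM solves it as a general linear system. The sketch below is not the paper's algorithm: the MDP instance is random, and CGNR (conjugate gradient on the normal equations) stands in as one simple Krylov method for the nonsymmetric matrix I - γP.

```python
import numpy as np

rng = np.random.default_rng(0)
n, gamma = 50, 0.95

# Hypothetical policy-evaluation problem: a random row-stochastic transition
# matrix P (induced by some fixed policy) and a random reward vector r.
P = rng.random((n, n))
P /= P.sum(axis=1, keepdims=True)
r = rng.random(n)

# Policy evaluation solves the linear Bellman system (I - gamma P) v = r.
A = np.eye(n) - gamma * P
v_exact = np.linalg.solve(A, r)

# Stationary iterative method: the classical fixed-point sweep v <- r + gamma P v,
# which contracts by a factor gamma per iteration in the infinity norm.
v_sim = np.zeros(n)
sim_iters = 0
while np.linalg.norm(r + gamma * P @ v_sim - v_sim, np.inf) > 1e-8:
    v_sim = r + gamma * P @ v_sim
    sim_iters += 1

# A Krylov subspace method: conjugate gradient on the normal equations
# A^T A v = A^T r (CGNR), valid for the nonsymmetric matrix A.
v_ksm = np.zeros(n)
res = r - A @ v_ksm          # residual b - A x
z = A.T @ res                # gradient of the normal-equations objective
p = z.copy()
ksm_iters = 0
while np.linalg.norm(z) > 1e-10 and ksm_iters < 10 * n:
    Ap = A @ p
    alpha = (z @ z) / (Ap @ Ap)
    v_ksm += alpha * p
    res -= alpha * Ap
    z_new = A.T @ res
    p = z_new + ((z_new @ z_new) / (z @ z)) * p
    z = z_new
    ksm_iters += 1

print(f"SIM iterations: {sim_iters}, KSM iterations: {ksm_iters}")
```

On instances like this the Krylov solver typically reaches a tighter residual in far fewer matrix-vector products than the stationary sweep, whose iteration count is governed by the discount factor γ.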
Journal
-
- Transactions of the Society of Instrument and Control Engineers, 49 (7), 696-702, 2013
- The Society of Instrument and Control Engineers
Details
-
- CRID
- 1390282679479140992
-
- NII Article ID
- 10031188141
-
- NII Book ID
- AN00072392
-
- ISSN
- 18838189
- 04534654
-
- NDL BIB ID
- 024821264
-
- Text Lang
- ja
-
- Data Source
-
- JaLC
- NDL
- Crossref
- CiNii Articles
-
- Abstract License Flag
- Disallowed