Sampling Policy that Improves Performance of Policy in Reinforcement Learning

泉田 啓, 大坪 立サミュエル

doi:10.9746/sicetr.54.365

泉田啓
Graduate School of Engineering, Kyoto University

大坪立サミュエル
Graduate School of Engineering, Kyoto University

Bibliographic information

Alternative titles

Sampling Policy that Improves Performance of Policy in Reinforcement Learning
キョウカガクシュウニオケルホウサクノセイノウオコウジョウスルサンプリングホウサク

Abstract

When a reinforcement learning method is applied, the accuracy with which the state transition probabilities are estimated affects the performance of the policy obtained from the estimated plant. We therefore derive a sampling condition guaranteeing, with a desired degree of reliability, that the optimal policy for the estimated plant is also optimal for the real plant, and we propose a sampling method based on this condition. The number of samples can be reduced further by sampling not until the policy is reliably optimal for the real plant, but only until the policy is effective irrespective of estimation errors. We formulate the problem of finding a policy that is guaranteed, with the desired degree of reliability, to remain effective despite errors between the estimated and the real transition probabilities, and we propose a sampling method that solves this problem. The effectiveness of the proposed method is verified by numerical simulations.

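Journal

Transactions of the Society of Instrument and Control Engineers (計測自動制御学会論文集)

Transactions of the Society of Instrument and Control Engineers 54 (3), 365-372, 2018

The Society of Instrument and Control Engineers (公益社団法人計測自動制御学会)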

Details

CRID
1390282679486391808

NII Article ID
130006512912

NII Book ID
AN00072392

DOI
10.9746/sicetr.54.365

ISSN
18838189
04534654

NDL BIB ID
028916196

Web Site
http://id.ndl.go.jp/bib/028916196
https://ndlsearch.ndl.go.jp/books/R000000004-I028916196
https://www.jstage.jst.go.jp/article/sicetr/54/3/54_365/_pdf

Text language code
ja

Data source
- JaLC
- NDL
- Crossref
- CiNii Articles
- KAKEN

Abstract license flag
Disallowed
