Bibliographic Information
- Alternative title: Optimizing Betting Fraction in Compound Reinforcement Learning
Abstract
This paper describes a method for optimizing the betting fraction parameter in compound reinforcement learning. Compound reinforcement learning maximizes the expected logarithm of compound returns in return-based MDPs. To keep values from diverging to negative infinity, it introduces a betting fraction parameter, which in turn raises the problem of how to choose that parameter. We propose a method that optimizes the betting fraction by on-line gradient ascent within compound reinforcement learning.
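The idea of tuning a betting fraction by on-line gradient ascent can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a single-state setting with no MDP or value function, takes the objective to be the mean of log(1 + f * r) over sampled returns r, and assumes every return satisfies r >= -1. The function name `optimize_betting_fraction` and the parameters `f0`, `step`, `f_min`, and `f_max` are hypothetical names chosen for illustration.

```python
import random


def optimize_betting_fraction(returns, f0=0.5, step=0.001, f_min=0.0, f_max=0.99):
    """Online gradient ascent on f for the objective mean(log(1 + f * r)).

    Assumption: every return r satisfies r >= -1, so 1 + f * r stays
    positive whenever f <= f_max < 1.
    """
    f = f0
    for r in returns:
        grad = r / (1.0 + f * r)       # d/df of log(1 + f * r)
        f += step * grad               # ascent step on the sampled return
        f = min(max(f, f_min), f_max)  # projection keeps the logarithm finite
    return f


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical even-money bet that wins with probability 0.6.
    samples = [1.0 if rng.random() < 0.6 else -1.0 for _ in range(50_000)]
    print(optimize_betting_fraction(samples))
```

For this hypothetical bet, the maximizer of 0.6 log(1 + f) + 0.4 log(1 - f) is f = 0.2, so the estimate should settle near that value. Because the step size is constant, it keeps fluctuating slightly around it. The projection onto [f_min, f_max] is one simple way to keep the logarithm from diverging to negative infinity.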
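Journal
- 人工知能学会論文誌 (Transactions of the Japanese Society for Artificial Intelligence) 28 (3), 267-272, 2013
- 一般社団法人 人工知能学会 (The Japanese Society for Artificial Intelligence)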
Details
- CRID: 1390282680084776576
- NII Article ID: 130003362329
- BIBCODE: 2013TJSAI..28..267M
- ISSN: 13468030, 13460714
- Text language code: ja
- Data source type: JaLC, Crossref, CiNii Articles, KAKEN
- Abstract license flag: Disallowed