The Effect of UCB Algorithm in Reinforcement Learning
- Saito Koki (Osaka Prefecture University)
- Notsu Akira (Osaka Prefecture University)
- Honda Katsuhiro (Osaka Prefecture University)
Bibliographic Information
- Other Title: 強化学習におけるUCB行動選択手法の効果
Description
The UCB algorithm was proposed as an action-selection method for the multi-armed bandit problem. In this method, an agent chooses its action by comparing the upper bounds of the confidence intervals of the estimated values, and it thereby performs better than alternatives such as ε-greedy. In this paper, we propose a method for applying the UCB algorithm to Q-learning and experimentally evaluate its performance on a shortest-path problem in a continuous state space.
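As a rough illustration of the action-selection rule described above, the following Python sketch applies the standard UCB1 rule to a Bernoulli multi-armed bandit. It is not the method evaluated in the paper: the function names (`ucb1_select`, `update`, `run_bandit`), the exploration constant `c`, and the reward probabilities are hypothetical choices made for illustration, and the extension to Q-learning in a continuous state space is not shown.

```python
import math
import random


def ucb1_select(counts, values, c=2.0):
    """Return the index of the arm with the largest upper confidence bound.

    counts[a] is how often arm a was pulled, values[a] its mean reward so far.
    c is an assumed exploration constant (c=2.0 gives the usual UCB1 bonus).
    """
    # Pull every arm once first, so each estimate has at least one sample.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)
    # Estimated value plus a bonus that shrinks as the arm is sampled more.
    scores = [
        values[a] + math.sqrt(c * math.log(total) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def update(counts, values, arm, reward):
    """Update the pull count and the running mean reward of one arm in place."""
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]


def run_bandit(probs, steps=2000, seed=0):
    """Simulate UCB1 on Bernoulli arms with the given success probabilities."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    values = [0.0] * len(probs)
    for _ in range(steps):
        arm = ucb1_select(counts, values)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        update(counts, values, arm, reward)
    return counts, values
```

Calling `run_bandit([0.2, 0.5, 0.8])` should concentrate most pulls on the third arm, because the bonus of each weaker arm shrinks once that arm has been sampled enough to show its lower mean reward. An ε-greedy agent, by contrast, keeps exploring all arms at a fixed rate regardless of how well they are already known.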
Journal
- Proceedings of the Fuzzy System Symposium 30 (0), 174-179, 2014
- Japan Society for Fuzzy Theory and Intelligent Informatics
Details
- CRID: 1390001205673422976
- NII Article ID: 130005480437
- Text Lang: ja
- Data Source: JaLC, CiNii Articles
- Abstract License Flag: Disallowed