Satisficing, Acceleration, and Generalization in Online Reinforcement Learning (オンライン強化学習における満足化と高速化および汎化)
-
- 片山 晋
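- Research Associate, Department of Computational Intelligence and Systems Science, Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology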
Description
<p>Aiming at parameter-free reinforcement learning, this article presents the following three pieces of research. (1) Online reinforcement learning to satisfice: In online reinforcement learning in dynamic environments, the appropriate exploration rate depends on the nature of the environment. By relaxing the criterion from "optimization" to "satisficing", the proposed method guarantees convergence for all parameter values while coping with unexpected environmental changes. (2) Efficient implementation of TD(λ): While the naive implementation of TD(λ) with λ>0 costs time linear in the number of states, the proposed method implements TD(λ) exactly and quickly by computing each value lazily, that is, only when it is needed. Its computation time per time step is logarithmic in the number of states, while it needs three times the space of the naive implementation. (3) TD(λ) using Haar basis functions: An algorithm that efficiently implements TD(λ) learning with Haar basis functions is proposed. The algorithm can maintain and update the information in the infinite tree of coefficients in a finitely compressed form. The system of Haar basis functions includes both broad features, which have strong generalization and averaging ability, and narrow features, which allow high-precision approximation. In particular, TD(λ) with Haar basis functions can approximate an arbitrary continuous function on [0, 1) in the limit.</p>
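For orientation, the naive baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of textbook tabular TD(λ) with accumulating eligibility traces, not the lazy method proposed in the article. The function name `td_lambda_step`, the parameter values, and the five-state chain are assumptions made only for this example. The point is that every step decays and applies the trace of every state, which is where the linear cost comes from.

```python
import numpy as np

def td_lambda_step(V, e, s, r, s_next, alpha=0.1, gamma=0.9, lam=0.8):
    """One naive tabular TD(lambda) update with accumulating traces.

    V and e are modified in place. All names and defaults are illustrative.
    """
    delta = r + gamma * V[s_next] - V[s]  # TD error for this transition
    e *= gamma * lam                      # decay every trace: O(number of states)
    e[s] += 1.0                           # accumulate the trace of the visited state
    V += alpha * delta * e                # update every state value: O(number of states)
    return delta

# Hypothetical toy task: walk a 5-state chain left to right,
# receiving reward 1 on entering the last state.
n_states = 5
V = np.zeros(n_states)  # value estimates
e = np.zeros(n_states)  # eligibility traces
for s in range(n_states - 1):
    reward = 1.0 if s + 1 == n_states - 1 else 0.0
    td_lambda_step(V, e, s, reward, s + 1)
```

After the single rewarded transition, the update reaches all earlier states in proportion to their decayed traces. The two whole-array operations per step are the cost that, according to the abstract, the proposed lazy evaluation reduces to logarithmic time.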
Journal
-
- Journal of the Japanese Society for Artificial Intelligence
-
Journal of the Japanese Society for Artificial Intelligence 15 (6), 998-998, 2000-11-01
The Japanese Society for Artificial Intelligence
Details
-
- CRID
- 1390848647556169984
-
- NII Article ID
- 110002808366
-
- NII Book ID
- AN10067140
-
- ISSN
- 24358614
- 21882266
-
- Text Lang
- ja
-
- Data Source
-
- JaLC
- CiNii Articles
-
- Abstract License Flag
- Disallowed