An Actor-Critic Algorithm Using a Binary Tree Action Selector
- KIMURA Hajime
  Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
- KOBAYASHI Shigenobu
  Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
Bibliographic Information
- Other Title
- 確率的2分木の行動選択を用いたActor-Criticアルゴリズム
- 確率的2分木の行動選択を用いたActor-Criticアルゴリズム:多数の行動を扱う強化学習
- カクリツテキ 2ブンギ ノ コウドウ センタク オ モチイタ Actor Critic アルゴリズム タスウ ノ コウドウ オ アツカウ キョウカ ガクシュウ
- Reinforcement Learning to Cope with Enormous Actions
- 多数の行動を扱う強化学習
Abstract
In real-world applications, learning algorithms often have to handle several dozen actions among which some distance metric is defined. Epsilon-greedy and Boltzmann-distribution exploration strategies, which have been applied to Q-learning and SARSA, are popular, simple, and effective in problems that have only a few actions; however, their efficiency decreases as the number of actions increases. We propose a policy function representation that consists of a stochastic binary decision tree, and we apply it to an actor-critic algorithm for problems that have an enormous number of similar actions. Simulation results show that increasing the number of actions does not affect the learning curves of the proposed method at all.
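The abstract does not say how the stochastic binary decision tree is parameterized, so the Python sketch below is only one plausible reading and not the authors' implementation. It assumes a complete tree whose leaves are the actions, with one logit per internal node giving the probability of branching right. The class name `BinaryTreeSelector`, the heap-style node indexing, the learning rate, and the use of a scalar `td_error` supplied by some critic are all hypothetical choices made for illustration. For brevity the logits do not depend on the state; a real actor would compute them from the state.

```python
import math
import random


class BinaryTreeSelector:
    """Illustrative stochastic binary tree over 2**depth actions (assumed design)."""

    def __init__(self, depth, seed=None):
        self.depth = depth
        self.num_actions = 2 ** depth
        # One logit per internal node, stored heap-style:
        # node i has children 2*i + 1 (left) and 2*i + 2 (right).
        self.logits = [0.0] * (self.num_actions - 1)
        self.rng = random.Random(seed)

    def _p_right(self, node):
        # Probability of taking the right branch at this internal node.
        return 1.0 / (1.0 + math.exp(-self.logits[node]))

    def sample(self):
        """Walk from root to leaf; return (action, path of (node, went_right))."""
        node, path = 0, []
        for _ in range(self.depth):
            went_right = self.rng.random() < self._p_right(node)
            path.append((node, went_right))
            node = 2 * node + (2 if went_right else 1)
        # Leaves occupy heap indices num_actions - 1 .. 2 * num_actions - 2.
        return node - (self.num_actions - 1), path

    def prob(self, action):
        """Action probability = product of branch probabilities along its path."""
        node = action + self.num_actions - 1
        p = 1.0
        while node > 0:
            parent = (node - 1) // 2
            p_right = self._p_right(parent)
            p *= p_right if node == 2 * parent + 2 else 1.0 - p_right
            node = parent
        return p

    def update(self, path, td_error, lr=0.1):
        """Policy-gradient style step on the visited nodes only.

        For a sigmoid branch, d log(pi) / d logit = went_right - p_right.
        """
        for node, went_right in path:
            grad = (1.0 if went_right else 0.0) - self._p_right(node)
            self.logits[node] += lr * td_error * grad


selector = BinaryTreeSelector(depth=4, seed=0)  # 16 actions
action, path = selector.sample()
selector.update(path, td_error=1.0)  # td_error would come from the critic
```

In this sketch, sampling and updating touch only `depth` nodes, so the per-step cost grows with the logarithm of the number of actions. Neighbouring leaves also share most of their path, which is one way a tree can exploit similarity among actions.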
Journal
- Transactions of the Society of Instrument and Control Engineers
- Transactions of the Society of Instrument and Control Engineers 37 (12), 1147-1155, 2001
- The Society of Instrument and Control Engineers
Details
- CRID: 1390282679477869056
- NII Article ID: 130003970998, 10007403471
- NII Book ID: AN00072392
- ISSN: 18838189, 04534654 (http://id.crossref.org/issn/04534654)
- NDL BIB ID: 6020326
- Data Source: JaLC, NDL, Crossref, CiNii Articles
- Abstract License Flag: Disallowed