An Actor-Critic Algorithm Using a Binary Tree Action Selector

  • KIMURA Hajime
    Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
  • KOBAYASHI Shigenobu
    Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology

Bibliographic Information

Other Title
  • 確率的2分木の行動選択を用いたActor-Criticアルゴリズム (An Actor-Critic Algorithm Using Stochastic Binary Tree Action Selection)
  • 確率的2分木の行動選択を用いたActor-Criticアルゴリズム:多数の行動を扱う強化学習 (An Actor-Critic Algorithm Using Stochastic Binary Tree Action Selection: Reinforcement Learning to Cope with Enormous Actions)
  • Reinforcement Learning to Cope with Enormous Actions (多数の行動を扱う強化学習)


Abstract

In real-world applications, learning algorithms often have to handle several dozen actions over which some distance metric is defined. The epsilon-greedy and Boltzmann-distribution exploration strategies commonly used with Q-learning and SARSA are popular, simple, and effective for problems with only a few actions; however, their efficiency decreases as the number of actions grows. We propose a policy function representation consisting of a stochastic binary decision tree and apply it to an actor-critic algorithm for problems with a large number of similar actions. Simulation results show that increasing the number of actions does not affect the learning curves of the proposed method at all.
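
The abstract only names the policy representation, so the following is a minimal, hypothetical Python sketch of a stochastic binary-tree action selector for an actor-critic learner; it is not the authors' exact formulation. It assumes the number of actions is a power of two, that each internal node branches right with probability sigmoid(theta), and that the actor parameters are adjusted by a policy-gradient rule weighted by the critic's TD error. The class and method names are invented for illustration.

```python
import numpy as np


class StochasticBinaryTreePolicy:
    """Policy over 2**depth actions, represented as a stochastic binary tree.

    Each internal node holds one parameter theta; the probability of
    branching right at that node is sigmoid(theta).  The probability of
    choosing a leaf (action) is the product of the branch probabilities
    along its root-to-leaf path, so sampling an action and computing the
    log-probability gradient both cost O(depth) = O(log2(number of actions)).
    """

    def __init__(self, depth, seed=None):
        self.depth = depth
        self.n_actions = 2 ** depth
        # One parameter per internal node, stored in heap order
        # (node i has children 2*i + 1 and 2*i + 2).
        self.theta = np.zeros(self.n_actions - 1)
        self.rng = np.random.default_rng(seed)

    @staticmethod
    def _sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def sample(self):
        """Descend from the root, flipping a biased coin at each node.

        Returns the chosen action index and the gradient of log pi(action)
        with respect to theta (the actor's eligibility vector).
        """
        node = 0
        grad = np.zeros_like(self.theta)
        for _ in range(self.depth):
            p_right = self._sigmoid(self.theta[node])
            go_right = self.rng.random() < p_right
            # d log P(branch) / d theta[node]:
            #   +(1 - p_right) if we branch right, -p_right if we branch left.
            grad[node] = (1.0 - p_right) if go_right else -p_right
            node = 2 * node + (2 if go_right else 1)
        return node - (self.n_actions - 1), grad  # leaf index -> action id

    def update(self, grad, td_error, lr=0.1):
        """Policy-gradient actor update driven by the critic's TD error."""
        self.theta += lr * td_error * grad


# Hypothetical usage: 64 actions require a tree of depth 6; a separate
# critic (e.g. a TD(0) state-value table) would supply td_error.
policy = StochasticBinaryTreePolicy(depth=6)
action, eligibility = policy.sample()
policy.update(eligibility, td_error=0.5)
```

Because each decision touches only one root-to-leaf path, the cost of selecting an action and of the actor update grows logarithmically with the number of actions, which is the property motivating the representation described in the abstract.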

