A Meta-Parameter Learning Method in Reinforcement Learning Based on Temporal Difference Error

Bibliographic information

Alternative titles
  • TD誤差に基づく強化学習のメタパラメータ学習法
  • TD ゴサ ニ モトズク キョウカ ガクシュウ ノ メタパラメータ ガクシュウホウ

Description

In general, meta-parameters in a reinforcement learning system, such as the learning rate, are determined empirically and held fixed during learning. As a result, when the external environment changes, the system cannot adjust to the change. Meanwhile, it has been suggested that the biological brain conducts reinforcement learning and adapts to the external environment by controlling neuromodulators that correspond to meta-parameters. Based on this suggestion, the present paper proposes a method that adjusts meta-parameters using the TD-error. Computer simulations on a maze problem and an inverted pendulum control problem verify that the meta-parameters are adjusted appropriately according to the amplitude of the TD-error.
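
The abstract does not state the update rule itself, so the following is only a minimal sketch of the general idea: compute the one-step TD-error and let a meta-parameter (here the learning rate) track its recent amplitude. The function names, the exponential mapping, and all constants (`alpha_min`, `alpha_max`, `sensitivity`, `smoothing`) are assumptions made for illustration and are not taken from the paper.

```python
import math


def td_error(reward, gamma, v_next, v_current):
    """One-step TD-error: delta = r + gamma * V(s') - V(s)."""
    return reward + gamma * v_next - v_current


def learning_rate_from_td(avg_abs_delta, alpha_min=0.01, alpha_max=0.5,
                          sensitivity=1.0):
    """Hypothetical mapping (an assumption, not the paper's rule):
    a larger recent |TD-error| gives a larger learning rate,
    bounded between alpha_min and alpha_max."""
    scale = 1.0 - math.exp(-sensitivity * avg_abs_delta)
    return alpha_min + (alpha_max - alpha_min) * scale


def simulate(rewards, gamma=0.9, smoothing=0.9):
    """Learn the value of a single state whose successor is terminal
    (so V(s') = 0), adapting the learning rate at every step."""
    value = 0.0
    avg_abs_delta = 0.0
    history = []
    for r in rewards:
        delta = td_error(r, gamma, v_next=0.0, v_current=value)
        # Exponential moving average of the TD-error amplitude.
        avg_abs_delta = smoothing * avg_abs_delta + (1.0 - smoothing) * abs(delta)
        alpha = learning_rate_from_td(avg_abs_delta)
        value += alpha * delta
        history.append((value, alpha))
    return history


if __name__ == "__main__":
    # The reward switches from 1.0 to 0.0 at step 200, mimicking an
    # environment change; the learning rate rises again after the switch.
    for step, (value, alpha) in enumerate(simulate([1.0] * 200 + [0.0] * 20)):
        if step % 20 == 0 or 198 <= step <= 205:
            print(f"step={step:3d} value={value:.3f} alpha={alpha:.3f}")
```

In this toy setting the learning rate decays toward `alpha_min` while the value estimate is accurate and increases once the reward changes, which is the qualitative behavior the abstract describes for TD-error-driven meta-parameter adjustment.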
