Reward Design Method Adapting to Agents' Learning Ability based on Self-Organizing Map with Evaluation Value

Bibliographic Information

Other Title
  • 評価値付き入力ベクトルを扱う自己組織化マップを用いたエージェントの学習パラメータに応じた報酬設計手法
  • ヒョウカ ネツケキ ニュウリョク ベクトル オ アツカウ ジコ ソシキカ マップ オ モチイタ エージェント ノ ガクシュウ パラメータ ニ オウジタ ホウシュウ セッケイ シュホウ

Abstract

<p>In children's education and sports coaching, it is important to give learners appropriate instruction. This requires grasping each learner's abilities and characteristics by observing the learning process, and changing the teaching method as needed. In this paper, we examine learning parameters and appropriate methods of giving rewards, using simulation data from agents that learn a maze. For maze learning, we use Q-learning, a well-known reinforcement learning method, and we conduct experiments with multiple agents that have different learning parameters. Agent behavior data from the middle stage of learning are classified with a self-organizing map (SOM), and the learning parameters are estimated from the result. We then change the method of giving rewards and, based on the learning results, examine how it should be chosen according to the learning parameters.</p>
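
As an illustration of the setup the abstract describes, the following is a minimal sketch of tabular Q-learning on a small grid maze, run with different learning parameters. It is not the authors' implementation: the maze layout, the goal-only reward, the epsilon-greedy policy, the parameter values, and the names `MAZE`, `step`, and `train` are all assumptions made for this example. The SOM classification stage is not shown.

```python
import random

# Hypothetical 4x4 maze (an assumption, not the maze from the paper):
# 'S' start, 'G' goal, '#' wall, '.' free cell.
MAZE = [
    "S..#",
    ".#..",
    "...#",
    "#..G",
]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply an action; hitting a wall or the border leaves the state unchanged."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    inside = 0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0])
    if not inside or MAZE[nr][nc] == "#":
        nr, nc = r, c
    done = MAZE[nr][nc] == "G"
    reward = 1.0 if done else 0.0  # assumed reward scheme: reward only at the goal
    return (nr, nc), reward, done


def train(alpha, gamma, epsilon, episodes=300, max_steps=200, seed=0):
    """Tabular Q-learning with an epsilon-greedy policy.

    alpha: learning rate, gamma: discount factor, epsilon: exploration rate.
    Returns the Q-table and the number of steps taken in each episode.
    """
    rng = random.Random(seed)
    q = {}  # (state, action) -> value; missing entries count as 0.0
    steps_per_episode = []
    for _ in range(episodes):
        state = (0, 0)
        for t in range(1, max_steps + 1):
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                values = [q.get((state, a), 0.0) for a in range(len(ACTIONS))]
                best = max(values)
                # Break ties randomly among the greedy actions.
                action = rng.choice([a for a, v in enumerate(values) if v == best])
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(
                q.get((nxt, a), 0.0) for a in range(len(ACTIONS))
            )
            old = q.get((state, action), 0.0)
            # Q-learning update: move the estimate toward the bootstrapped target.
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if done:
                break
        steps_per_episode.append(t)
    return q, steps_per_episode


if __name__ == "__main__":
    # Two hypothetical agents that differ only in their learning rate.
    for alpha in (0.1, 0.9):
        _, steps = train(alpha=alpha, gamma=0.9, epsilon=0.1)
        print(f"alpha={alpha}: mean steps over last 50 episodes = {sum(steps[-50:]) / 50:.1f}")
```

The per-episode step counts returned by `train` are one possible form of the "agent behavior data" mentioned above; in a pipeline like the one described, features of this kind taken from the middle stage of learning would be passed to the SOM.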

Details

  • CRID
    1390845713038156672
  • NII Article ID
    130007554399
  • NII Book ID
    AA12165648
  • ISSN
    18820212
  • DOI
    10.14864/fss.34.0_140
  • NDL BIB ID
    029268208
  • Text Lang
    ja
  • Data Source
    • JaLC
    • NDL
    • CiNii Articles
  • Abstract License Flag
    Disallowed
