Reward Design Method Adapting to Agents' Learning Ability based on Self-Organizing Map with Evaluation Value
- HORIO Keiichi (Kyushu Institute of Technology)
- MORI Ippei (Kyushu Institute of Technology)
- FURUKAWA Tetsuo (Kyushu Institute of Technology)
Bibliographic Information
- Other Title
-
- 評価値付き入力ベクトルを扱う自己組織化マップを用いたエージェントの学習パラメータに応じた報酬設計手法
- ヒョウカ ネツケキ ニュウリョク ベクトル オ アツカウ ジコ ソシキカ マップ オ モチイタ エージェント ノ ガクシュウ パラメータ ニ オウジタ ホウシュウ セッケイ シュホウ
Abstract
<p>In children's education and sports coaching, it is important to give learners appropriate instruction. The instructor needs to grasp each learner's ability and characteristics by observing the learning process, and to change the teaching method as needed. In this paper, we examine learning parameters and appropriate methods of giving rewards, using simulation data in which agents learn a maze. For maze learning, we used Q-learning, a well-known method in the field of reinforcement learning, and conducted experiments using multiple agents with different learning parameters. Agent behavior data from the middle stage of learning are classified by a self-organizing map (SOM), and the learning parameters are estimated from the result. We then change the method of giving rewards and, based on the learning results, discuss which method suits which learning parameters.</p>
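The maze-learning setup described in the abstract can be illustrated with a minimal tabular Q-learning sketch. This is not the authors' implementation: the 4x4 maze layout, the goal-only reward of 1.0, the epsilon-greedy exploration, and the names `MAZE`, `step`, `train`, and `greedy_path_length` are all assumptions made for illustration. The learning rate `alpha` and discount factor `gamma` stand in for the "learning parameters" that differ between agents; the abstract does not say which parameters were actually varied.

```python
import random

# Hypothetical 4x4 maze (not from the paper): S = start, G = goal, # = wall.
MAZE = [
    "S..#",
    ".#..",
    "...#",
    "#..G",
]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply an action; hitting a wall or the border leaves the state unchanged."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if not (0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0])) or MAZE[nr][nc] == "#":
        nr, nc = r, c
    done = MAZE[nr][nc] == "G"
    reward = 1.0 if done else 0.0  # assumed reward scheme: reward only at the goal
    return (nr, nc), reward, done


def train(alpha, gamma, epsilon=0.1, episodes=500, max_steps=200, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration; returns the Q-table."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS)
         for r in range(len(MAZE)) for c in range(len(MAZE[0]))}
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                # Greedy action, with ties broken at random.
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(len(ACTIONS)) if q[state][a] == best])
            nxt, reward, done = step(state, action)
            # Standard Q-learning update: move Q(s, a) toward r + gamma * max Q(s', .).
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def greedy_path_length(q, max_steps=50):
    """Follow the greedy policy from the start; returns the number of steps taken."""
    state, steps = (0, 0), 0
    while MAZE[state[0]][state[1]] != "G" and steps < max_steps:
        action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
        state, _, _ = step(state, action)
        steps += 1
    return steps
```

Under these assumptions, agents with different learning parameters can be produced by calling `train` with different `alpha` and `gamma` values, for example `train(alpha=0.1, gamma=0.9)` and `train(alpha=0.9, gamma=0.5)`, and compared with `greedy_path_length`. Stopping training after fewer episodes would give the kind of mid-learning behavior data the abstract mentions, though how the authors encoded that data for the SOM is not stated here.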
Journal
-
- Proceedings of the Fuzzy System Symposium
-
Proceedings of the Fuzzy System Symposium 34 (0), 140-143, 2018
Japan Society for Fuzzy Theory and Intelligent Informatics
Details
- CRID: 1390845713038156672
- NII Article ID: 130007554399
- NII Book ID: AA12165648
- ISSN: 18820212
- NDL BIB ID: 029268208
- Text Lang: ja
- Data Source: JaLC, NDL, CiNii Articles
- Abstract License Flag: Disallowed