Proposal of a Propagation Algorithm of the Expected Failure Probability and the Effectiveness on Multi-agent Environments
- Muraoka Hiroki, Engine Engineering Div., Hino Motors, Ltd.
- Miyazaki Kazuteru, Research Department, National Institution for Academic Degrees and University Evaluation
- Kobayashi Hiroaki, Department of Mechanical Engineering Informatics, Meiji University
Bibliographic Information
- Other Title
- 失敗確率伝播アルゴリズムEFPAの提案とマルチエージェント環境下での有効性の検証
- シッパイ カクリツ デンパ アルゴリズム EFPA ノ テイアン ト マルチエージェント カンキョウ カ デ ノ ユウコウセイ ノ ケンショウ
Abstract
The Improved Penalty Avoiding Rational Policy Making algorithm (IPARP) is known to learn policies from both a reward and a penalty. IPARP aims to identify penalty rules, that is, rules that are highly likely to receive a penalty. Although IPARP is effective in many cases, it requires many trial-and-error searches because of memory constraints. In this paper, we propose a method called the Expected Failure Probability Algorithm (EFPA) to speed up this process. In addition, we extend EFPA to multi-agent environments. In multi-agent learning, it is important to avoid the concurrent learning problem, which occurs when multiple agents learn simultaneously. We also propose a method to avoid this problem and confirm its effectiveness through numerical experiments.
Journal
- IEEJ Transactions on Electronics, Information and Systems
IEEJ Transactions on Electronics, Information and Systems 136 (3), 273-281, 2016
The Institute of Electrical Engineers of Japan
Details
- CRID: 1390001204607796736
- NII Article ID: 130005132275
- NII Book ID: AN10065950
- ISSN: 13488155, 03854221
- NDL BIB ID: 027160085
- Text Lang: ja
- Data Source: JaLC, NDL, Crossref, CiNii Articles, KAKEN
- Abstract License Flag: Disallowed