Proposal of a Propagation Algorithm of the Expected Failure Probability and the Effectiveness on Multi-agent Environments

Bibliographic Information

Other Title
  • 失敗確率伝播アルゴリズムEFPAの提案とマルチエージェント環境下での有効性の検証 (Proposal of the Failure Probability Propagation Algorithm EFPA and Verification of Its Effectiveness in Multi-agent Environments)

Abstract

The Improved Penalty Avoiding Rational Policy Making algorithm (IPARP) is known to learn policies from both rewards and penalties. IPARP aims to identify penalty rules, that is, rules that are highly likely to incur a penalty. Although IPARP is effective in many cases, it requires many trial-and-error searches because of memory constraints. In this paper, we propose a method called the Expected Failure Probability Algorithm (EFPA) to speed up this search. We also extend EFPA to multi-agent environments. In multi-agent learning, it is important to avoid the concurrent learning problem, which arises when multiple agents learn simultaneously. We propose a method to avoid this problem and confirm its effectiveness through numerical experiments.
