Rationality of reward sharing in multi-agent reinforcement learning

Cited by: 9

Authors:

Miyazaki, K

Kobayashi, S

Affiliations:

[1] Natl Inst Acad Degrees, Bunkyo Ku, Tokyo 1120012, Japan

[2] Tokyo Inst Technol, Midori Ku, Yokohama, Kanagawa 2268502, Japan

Source:

NEW GENERATION COMPUTING | 2001 / Vol. 19 / No. 2

Keywords:

reinforcement learning; multi-agent system; profit sharing; rationality theorem; direct and indirect rewards;

DOI:

10.1007/BF03037252

Chinese Library Classification:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

In multi-agent reinforcement learning systems, it is important to share a reward among all agents. We focus on the Rationality Theorem of Profit Sharing⁵⁾ and analyze how to share a reward among all profit sharing agents. When an agent gets a direct reward $R$ ($R > 0$), an indirect reward $\mu R$ ($\mu \geq 0$) is given to the other agents. We have derived the necessary and sufficient condition to preserve the rationality as follows: $\mu < \frac{M - 1}{M^{W}\left(1 - (1/M)^{W_0}\right)(n - 1)L}$, where $M$ and $L$ are the maximum numbers of all conflicting rules and of conflicting rational rules for the same sensory input, $W$ and $W_0$ are the maximum episode lengths of a direct-reward agent and an indirect-reward agent, and $n$ is the number of agents. This theory is derived by avoiding the least desirable situation, in which the expected reward per action is zero. Therefore, if we use this theorem, we can experience several efficient aspects of reward sharing. Through numerical examples, we confirm the effectiveness of this theorem.
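
The condition stated in the abstract can be checked numerically. The following is only a minimal sketch: it assumes the bound reads mu < (M - 1) / (M^W * (1 - (1/M)^W0) * (n - 1) * L), and the function names and the sample parameter values are hypothetical illustrations, not taken from the paper.

```python
from fractions import Fraction


def mu_upper_bound(M, L, W, W0, n):
    """Strict upper bound on mu, assuming the abstract's condition reads
    mu < (M - 1) / (M**W * (1 - (1/M)**W0) * (n - 1) * L).

    M  : maximum number of conflicting rules (all rules)
    L  : maximum number of conflicting rational rules
    W  : maximum episode length of a direct-reward agent
    W0 : maximum episode length of an indirect-reward agent
    n  : number of agents
    """
    # Guard against values that would make the denominator zero or negative.
    if M < 2 or L < 1 or W < 1 or W0 < 1 or n < 2:
        raise ValueError("need M >= 2, L >= 1, W >= 1, W0 >= 1, n >= 2")
    M = Fraction(M)  # exact rational arithmetic, no rounding error
    denominator = M**W * (1 - (1 / M) ** W0) * (n - 1) * L
    return (M - 1) / denominator


def preserves_rationality(mu, M, L, W, W0, n):
    """True if 0 <= mu and mu is strictly below the assumed bound."""
    return 0 <= mu < mu_upper_bound(M, L, W, W0, n)


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    print(mu_upper_bound(M=2, L=1, W=3, W0=3, n=2))
    print(preserves_rationality(Fraction(1, 8), M=2, L=1, W=3, W0=3, n=2))
```

Under this reading the bound shrinks as the number of agents n or the episode length W grows, so larger systems would tolerate only a smaller indirect reward.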


Pages: 157-172

Page count: 16

50 items in total

[1] Rationality of reward sharing in multi-agent reinforcement learning
Kazuteru Miyazaki
Shigenobu Kobayashi
[J]. New Generation Computing, 2001, 19 : 157 - 172
[2] On the rationality of Profit Sharing in multi-agent reinforcement learning
Miyazaki, K
Kobayashi, S
[J]. ICCIMA 2001: FOURTH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND MULTIMEDIA APPLICATIONS, PROCEEDINGS, 2001, : 123 - 127
[3] On the rationality of profit sharing in multi-agent reinforcement learning
Miyazaki, K
Kobayashi, S
[J]. ICCIMA 2001: FOURTH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND MULTIMEDIA APPLICATIONS, PROCEEDINGS, 2001, : 421 - 425
[4] Multi-Agent Reinforcement Learning with Reward Delays
Zhang, Yuyang
Zhang, Runyu
Gu, Yuantao
Li, Na
[J]. LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, VOL 211, 2023, 211
[5] Direct reward and indirect reward in multi-agent reinforcement learning
Ohta, M
[J]. ROBOCUP 2002: ROBOT SOCCER WORLD CUP VI, 2003, 2752 : 359 - 366
[6] Multi-Agent Uncertainty Sharing for Cooperative Multi-Agent Reinforcement Learning
Chen, Hao
Yang, Guangkai
Zhang, Junge
Yin, Qiyue
Huang, Kaiqi
[J]. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
[7] Cooperative Multi-Agent Reinforcement Learning with Dynamic Target Localization: A Reward Sharing Approach
Wickramaarachchi, Helani
Kirley, Michael
Geard, Nicholas
[J]. ADVANCES IN ARTIFICIAL INTELLIGENCE, AI 2023, PT II, 2024, 14472 : 310 - 324
[8] Individual Reward Assisted Multi-Agent Reinforcement Learning
Wang, Li
Zhang, Yupeng
Hu, Yujing
Wang, Weixun
Zhang, Chongjie
Gao, Yang
Hao, Jianye
Lv, Tangjie
Fan, Changjie
[J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
[9] Autonomous learning of reward distribution for each agent in multi-agent reinforcement learning
Shibata, K
Ito, K
[J]. INTELLIGENT AUTONOMOUS SYSTEMS 6, 2000, : 495 - 502
[10] A Multi-agent Reinforcement Learning with Weighted Experience Sharing
Yu, Lasheng
Abdulai, Issahaku
[J]. ADVANCED INTELLIGENT COMPUTING THEORIES AND APPLICATIONS: WITH ASPECTS OF ARTIFICIAL INTELLIGENCE, 2012, 6839 : 219 - 225
