Multi-Agent Incentive Communication via Decentralized Teammate Modeling

Cited by: 0
Authors
Yuan, Lei [1 ,3 ]
Wang, Jianhao [2 ]
Zhang, Fuxiang [1 ]
Wang, Chenghe [1 ]
Zhang, Zongzhang [1 ]
Yu, Yang [1 ,3 ,4 ]
Zhang, Chongjie [2 ]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
[2] Tsinghua Univ, Inst Interdisciplinary Informat Sci, Beijing 100084, Peoples R China
[3] Polixir Technol, Nanjing 210000, Peoples R China
[4] Peng Cheng Lab, Shenzhen 518055, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Effective communication can improve coordination in cooperative multi-agent reinforcement learning (MARL). One popular communication scheme is to exchange agents' local observations or latent embeddings and use them to augment each agent's local policy input. Such a communication paradigm can reduce uncertainty in local decision-making and induce implicit coordination. However, it enlarges agents' local policy spaces and increases learning complexity, leading to poor coordination in complex settings. To address this limitation, this paper proposes a novel framework, Multi-Agent Incentive Communication (MAIC), which allows each agent to learn to generate incentive messages that directly bias other agents' value functions, yielding effective explicit coordination. Our method first learns targeted teammate models, with which each agent can anticipate a teammate's action selection and generate messages tailored to specific agents. We further introduce a novel regularization that leverages interaction sparsity to improve communication efficiency. MAIC is agnostic to the underlying MARL algorithm and can be flexibly integrated with different value-function factorization methods. Empirical results demonstrate that our method significantly outperforms baselines and achieves excellent performance on multiple cooperative MARL tasks.
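The core mechanism the abstract describes — each agent uses a learned teammate model to emit an incentive message that is added to a teammate's value function before action selection — can be illustrated with a minimal sketch. This is not the paper's implementation: the linear "teammate model," the additive bias, the `MSG_SCALE` weight, and all names are simplified assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
N_AGENTS, N_ACTIONS, OBS_DIM, MSG_SCALE = 3, 4, 5, 0.5

# Hypothetical linear "teammate model": sender i predicts teammate j's
# action preferences from i's own observation, then emits an incentive
# message tailored to j. (MAIC learns such models; the linear form here
# is an illustrative stand-in, not the paper's architecture.)
W_model = rng.normal(size=(N_AGENTS, N_AGENTS, OBS_DIM, N_ACTIONS))

def incentive_messages(i, obs_i):
    """Messages agent i sends to every teammate j != i."""
    msgs = {}
    for j in range(N_AGENTS):
        if j == i:
            continue
        pred_pref = obs_i @ W_model[i, j]   # predicted preference over j's actions
        msgs[j] = MSG_SCALE * pred_pref     # incentive tailored to agent j
        # A sparsity regularizer (e.g., penalizing message magnitude) could
        # suppress messages to agents that rarely interact with i.
    return msgs

def select_action(j, q_local, inbox):
    """Agent j biases its local Q-values with received incentives, then acts greedily."""
    q_biased = q_local + sum(inbox, np.zeros(N_ACTIONS))
    return int(np.argmax(q_biased))

# One decentralized step: every agent sends, then every agent acts.
obs = rng.normal(size=(N_AGENTS, OBS_DIM))
q_locals = rng.normal(size=(N_AGENTS, N_ACTIONS))
outbox = {i: incentive_messages(i, obs[i]) for i in range(N_AGENTS)}
actions = [
    select_action(j, q_locals[j],
                  [outbox[i][j] for i in range(N_AGENTS) if i != j])
    for j in range(N_AGENTS)
]
print(actions)  # one greedy action index per agent
```

The point of the sketch is the message flow: incentives bias teammates' value functions directly (explicit coordination), rather than enlarging each agent's policy input with raw observations (implicit coordination).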
Pages: 9466-9474
Page count: 9