Promoting Coordination through Policy Regularization in Multi-Agent Deep Reinforcement Learning

Cited by: 0
Authors
Roy, Julien [1 ]
Barde, Paul [2 ]
Harvey, Felix G. [1 ]
Nowrouzezahrai, Derek [2 ]
Pal, Christopher [1 ,3 ]
Affiliations
[1] Polytech Montreal, Quebec AI Inst Mila, Montreal, PQ, Canada
[2] McGill Univ, Quebec AI Inst Mila, Montreal, PQ, Canada
[3] Element AI, Montreal, PQ, Canada
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020 | 2020 / Vol. 33
Keywords: (none listed)
DOI: not available
Chinese Library Classification: TP18 [Theory of Artificial Intelligence]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
In multi-agent reinforcement learning, discovering successful collective behaviors is challenging as it requires exploring a joint action space that grows exponentially with the number of agents. While the tractability of independent agent-wise exploration is appealing, this approach fails on tasks that require elaborate group strategies. We argue that coordinating the agents' policies can guide their exploration and we investigate techniques to promote such an inductive bias. We propose two policy regularization methods: TeamReg, which is based on inter-agent action predictability, and CoachReg, which relies on synchronized behavior selection. We evaluate each approach on four challenging continuous control tasks with sparse rewards that require varying levels of coordination, as well as on the discrete-action Google Research Football environment. Our experiments show improved performance across many cooperative multi-agent problems. Finally, we analyze the effects of our proposed methods on the policies that our agents learn and show that our methods successfully enforce the qualities that we propose as proxies for coordinated behaviors.
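The TeamReg idea summarized above, regularizing agents toward actions their teammates can predict, can be illustrated as an auxiliary loss term. The sketch below is not the paper's implementation; the function name, the fixed mixing weight `lam`, and the use of mean squared error over continuous actions are all illustrative assumptions.

```python
import numpy as np

def teamreg_loss(task_loss, teammate_actions, predicted_actions, lam=0.1):
    """Hypothetical TeamReg-style objective: the ordinary task loss plus a
    predictability penalty, i.e. the mean squared error between the actions
    teammates actually took and the actions this agent's teammate model
    predicted for them. Minimizing the penalty pushes agents toward mutually
    predictable (hence more coordinated) behavior; `lam` trades it off
    against task performance."""
    actions = np.asarray(teammate_actions, dtype=float)
    predictions = np.asarray(predicted_actions, dtype=float)
    prediction_error = float(np.mean((actions - predictions) ** 2))
    return task_loss + lam * prediction_error

# Perfect predictions leave the task loss unchanged ...
print(teamreg_loss(0.5, [1.0, -1.0], [1.0, -1.0]))          # 0.5
# ... while unpredictable teammates incur a penalty (0.5 + 0.1 * 1.0).
print(teamreg_loss(0.5, [1.0, -1.0], [0.0, 0.0], lam=0.1))  # 0.6
```

In the paper this term would enter each agent's policy update alongside the reinforcement learning objective; here it is shown as a standalone function only to make the shape of the regularizer concrete.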
Pages: 12
Related Papers (50 in total)
  • [21] Multi-agent deep reinforcement learning: a survey
    Gronauer, Sven
    Diepold, Klaus
    ARTIFICIAL INTELLIGENCE REVIEW, 2022, 55 (02) : 895 - 943
  • [22] Learning to Communicate with Deep Multi-Agent Reinforcement Learning
    Foerster, Jakob N.
    Assael, Yannis M.
    de Freitas, Nando
    Whiteson, Shimon
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [23] Distributed Coordination Guidance in Multi-Agent Reinforcement Learning
    Lau, Qiangfeng Peter
    Lee, Mong Li
    Hsu, Wynne
    2011 23RD IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2011), 2011, : 456 - 463
  • [24] Reinforcement learning of coordination in cooperative multi-agent systems
    Kapetanakis, S
    Kudenko, D
    EIGHTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-02)/FOURTEENTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE (IAAI-02), PROCEEDINGS, 2002, : 326 - 331
  • [25] Evaluating the Coordination of Agents in Multi-agent Reinforcement Learning
    Barton, Sean L.
    Zaroukian, Erin
    Asher, Derrik E.
    Waytowich, Nicholas R.
    INTELLIGENT HUMAN SYSTEMS INTEGRATION 2019, 2019, 903 : 765 - 770
  • [26] Improving coordination with communication in multi-agent reinforcement learning
    Szer, D
    Charpillet, F
    ICTAI 2004: 16TH IEEE INTERNATIONALCONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2004, : 436 - 440
  • [27] MAGNet: Multi-agent Graph Network for Deep Multi-agent Reinforcement Learning
    Malysheva, Aleksandra
    Kudenko, Daniel
    Shpilman, Aleksei
    2019 XVI INTERNATIONAL SYMPOSIUM PROBLEMS OF REDUNDANCY IN INFORMATION AND CONTROL SYSTEMS (REDUNDANCY), 2019, : 171 - 176
  • [28] Towards Pick and Place Multi Robot Coordination Using Multi-agent Deep Reinforcement Learning
    Lan, Xi
    Qiao, Yuansong
    Lee, Brian
    2021 7TH INTERNATIONAL CONFERENCE ON AUTOMATION, ROBOTICS AND APPLICATIONS (ICARA 2021), 2021, : 85 - 89
  • [29] Improving the multi-agent coordination through learning
    Patnaik, S
    Konar, A
    Mandal, AK
    IETE JOURNAL OF RESEARCH, 2005, 51 (05) : 395 - 406
  • [30] Hierarchical Multi-Agent Deep Reinforcement Learning to Develop Long-Term Coordination
    Ossenkopf, Marie
    Jorgensen, Mackenzie
    Geihs, Kurt
    SAC '19: PROCEEDINGS OF THE 34TH ACM/SIGAPP SYMPOSIUM ON APPLIED COMPUTING, 2019, : 922 - 929