Promoting Coordination through Policy Regularization in Multi-Agent Deep Reinforcement Learning

Cited by: 0
Authors
Roy, Julien [1 ]
Barde, Paul [2 ]
Harvey, Felix G. [1 ]
Nowrouzezahrai, Derek [2 ]
Pal, Christopher [1 ,3 ]
Affiliations
[1] Polytech Montreal, Quebec AI Inst Mila, Montreal, PQ, Canada
[2] McGill Univ, Quebec AI Inst Mila, Montreal, PQ, Canada
[3] Element AI, Montreal, PQ, Canada
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020 | 2020 / Vol. 33
Keywords: (none listed)
DOI: not available
Chinese Library Classification: TP18 [Theory of Artificial Intelligence]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
In multi-agent reinforcement learning, discovering successful collective behaviors is challenging as it requires exploring a joint action space that grows exponentially with the number of agents. While the tractability of independent agent-wise exploration is appealing, this approach fails on tasks that require elaborate group strategies. We argue that coordinating the agents' policies can guide their exploration and we investigate techniques to promote such an inductive bias. We propose two policy regularization methods: TeamReg, which is based on inter-agent action predictability, and CoachReg, which relies on synchronized behavior selection. We evaluate each approach on four challenging continuous control tasks with sparse rewards that require varying levels of coordination, as well as on the discrete-action Google Research Football environment. Our experiments show improved performance across many cooperative multi-agent problems. Finally, we analyze the effects of our proposed methods on the policies that our agents learn and show that our methods successfully enforce the qualities that we propose as proxies for coordinated behaviors.
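The TeamReg idea summarized above, regularizing agents toward actions their teammates can predict, can be illustrated as an auxiliary loss term. The sketch below is not the paper's implementation; the function name, the fixed mixing weight `lam`, and the use of mean squared error over continuous actions are all illustrative assumptions.

```python
import numpy as np

def teamreg_loss(task_loss, teammate_actions, predicted_actions, lam=0.1):
    """Hypothetical TeamReg-style objective: the ordinary task loss plus a
    predictability penalty, i.e. the mean squared error between the actions
    teammates actually took and the actions this agent's teammate model
    predicted for them. Minimizing the penalty pushes agents toward mutually
    predictable (hence more coordinated) behavior; `lam` trades it off
    against task performance."""
    actions = np.asarray(teammate_actions, dtype=float)
    predictions = np.asarray(predicted_actions, dtype=float)
    prediction_error = float(np.mean((actions - predictions) ** 2))
    return task_loss + lam * prediction_error

# Perfect predictions leave the task loss unchanged ...
print(teamreg_loss(0.5, [1.0, -1.0], [1.0, -1.0]))          # 0.5
# ... while unpredictable teammates incur a penalty (0.5 + 0.1 * 1.0).
print(teamreg_loss(0.5, [1.0, -1.0], [0.0, 0.0], lam=0.1))  # 0.6
```

In the paper this term would enter each agent's policy update alongside the reinforcement learning objective; here it is shown as a standalone function only to make the shape of the regularizer concrete.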
Pages: 12
Related Papers (50 in total)
  • [21] Multi-agent deep reinforcement learning: a survey
    Gronauer, Sven
    Diepold, Klaus
    ARTIFICIAL INTELLIGENCE REVIEW, 2022, 55 (02) : 895 - 943
  • [22] Learning to Communicate with Deep Multi-Agent Reinforcement Learning
    Foerster, Jakob N.
    Assael, Yannis M.
    de Freitas, Nando
    Whiteson, Shimon
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [23] Distributed Coordination Guidance in Multi-Agent Reinforcement Learning
    Lau, Qiangfeng Peter
    Lee, Mong Li
    Hsu, Wynne
    2011 23RD IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2011), 2011, : 456 - 463
  • [24] Reinforcement learning of coordination in cooperative multi-agent systems
    Kapetanakis, S
    Kudenko, D
    EIGHTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-02)/FOURTEENTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE (IAAI-02), PROCEEDINGS, 2002, : 326 - 331
  • [25] Evaluating the Coordination of Agents in Multi-agent Reinforcement Learning
    Barton, Sean L.
    Zaroukian, Erin
    Asher, Derrik E.
    Waytowich, Nicholas R.
    INTELLIGENT HUMAN SYSTEMS INTEGRATION 2019, 2019, 903 : 765 - 770
  • [26] Improving coordination with communication in multi-agent reinforcement learning
    Szer, D
    Charpillet, F
    ICTAI 2004: 16TH IEEE INTERNATIONALCONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2004, : 436 - 440
  • [27] MAGNet: Multi-agent Graph Network for Deep Multi-agent Reinforcement Learning
    Malysheva, Aleksandra
    Kudenko, Daniel
    Shpilman, Aleksei
    2019 XVI INTERNATIONAL SYMPOSIUM PROBLEMS OF REDUNDANCY IN INFORMATION AND CONTROL SYSTEMS (REDUNDANCY), 2019, : 171 - 176
  • [28] Towards Pick and Place Multi Robot Coordination Using Multi-agent Deep Reinforcement Learning
    Lan, Xi
    Qiao, Yuansong
    Lee, Brian
    2021 7TH INTERNATIONAL CONFERENCE ON AUTOMATION, ROBOTICS AND APPLICATIONS (ICARA 2021), 2021, : 85 - 89
  • [29] Improving the multi-agent coordination through learning
    Patnaik, S
    Konar, A
    Mandal, AK
    IETE JOURNAL OF RESEARCH, 2005, 51 (05) : 395 - 406
  • [30] Hierarchical Multi-Agent Deep Reinforcement Learning to Develop Long-Term Coordination
    Ossenkopf, Marie
    Jorgensen, Mackenzie
    Geihs, Kurt
    SAC '19: PROCEEDINGS OF THE 34TH ACM/SIGAPP SYMPOSIUM ON APPLIED COMPUTING, 2019, : 922 - 929