A Distributional Perspective on Multiagent Cooperation With Deep Reinforcement Learning

Cited by: 7
Authors
Huang, Liwei [1, 2]
Fu, Mingsheng [1]
Rao, Ananya [3]
Irissappane, Athirai A. [3]
Zhang, Jie [4]
Xu, Chengzhong [2]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 610054, Peoples R China
[2] Univ Macau, State Key Lab IoTSC, Taipa 999078, Macao, Peoples R China
[3] Univ Washington, Sch Engn & Technol, Tacoma, WA 98402 USA
[4] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
Funding
China Postdoctoral Science Foundation
Keywords
Deep reinforcement learning (RL); distributional RL; multiagent system; neural network; LEVEL;
DOI
10.1109/TNNLS.2022.3202097
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In many value decomposition-based multiagent reinforcement learning (MARL) algorithms, the overall performance of the multiagent system is represented by a scalar global Q value and optimized by minimizing the temporal difference (TD) error with respect to that global Q value. However, the global Q value cannot accurately model the distributed dynamics of the multiagent system, since it is only a simplified representation of the different individual Q values of the agents. To explicitly consider the correlations between different cooperative agents, in this article, we propose a distributional framework and construct a practical model called distributional multiagent cooperation (DMAC) from a novel distributional perspective. Specifically, in DMAC, we view the individual Q value for the executed action of a random agent as a value distribution, whose expectation can further represent the overall performance. Then, we employ distributional RL to minimize the difference between the estimated distribution and its target. The advantage of DMAC is that the distributed dynamics of agents can be explicitly modeled, which results in better performance. To verify the effectiveness of DMAC, we conduct extensive experiments under nine different scenarios of the StarCraft Multiagent Challenge (SMAC). Experimental results show that DMAC can significantly outperform the baselines with respect to the average median test win rate.
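The idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes the random agent is drawn uniformly, so each agent's Q value for its executed action becomes one atom of a discrete distribution. It also assumes a quantile-regression Huber loss (in the style of QR-DQN) as the distributional RL objective, which is one common choice and may differ from what DMAC uses. All function names are hypothetical.

```python
def random_agent_distribution(individual_qs):
    """Treat the Q value of a randomly chosen agent as a discrete distribution.

    Assumption of this sketch: the agent is sampled uniformly, so each
    individual Q value is an atom with probability 1/N.
    """
    n = len(individual_qs)
    return list(individual_qs), [1.0 / n] * n


def expectation(atoms, probs):
    """Expectation of the value distribution, a scalar summary of overall performance."""
    return sum(a * p for a, p in zip(atoms, probs))


def quantile_huber_loss(pred_quantiles, target_samples, kappa=1.0):
    """Quantile-regression Huber loss: summed over quantiles, averaged over targets.

    pred_quantiles[i] is the estimate for the quantile midpoint
    tau_i = (i + 0.5) / K. This loss is an assumed stand-in for
    "the difference between the estimated distribution and its target".
    """
    k = len(pred_quantiles)
    total = 0.0
    for i, theta in enumerate(pred_quantiles):
        tau = (i + 0.5) / k
        for z in target_samples:
            u = z - theta  # error between a target sample and the estimate
            if abs(u) <= kappa:
                huber = 0.5 * u * u
            else:
                huber = kappa * (abs(u) - 0.5 * kappa)
            # Asymmetric weight: under- and over-estimation are penalized differently.
            weight = abs(tau - (1.0 if u < 0 else 0.0))
            total += weight * huber
    return total / len(target_samples)


# Hypothetical individual Q values of four agents for their executed actions.
qs = [1.0, 2.0, 3.0, 6.0]
atoms, probs = random_agent_distribution(qs)
mean_q = expectation(atoms, probs)

# Estimated quantiles close to the target samples give a lower loss than a poor estimate.
good = quantile_huber_loss([1.0, 2.0, 3.0, 6.0], qs)
poor = quantile_huber_loss([0.0, 0.0, 0.0, 0.0], qs)
```

Under the uniform-sampling assumption, the expectation equals the mean of the individual Q values, so it differs from a summed global Q value only by the factor N. The distribution additionally keeps the spread across agents, which a single scalar discards.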
Pages: 4246-4259
Number of pages: 14