PRACM: Predictive Rewards for Actor-Critic with Mixing Function in Multi-Agent Reinforcement Learning

Cited by: 0
Authors
Yu, Sheng [1 ]
Liu, Bo [1 ]
Zhu, Wei [1 ]
Liu, Shuhong [1 ]
Affiliations
[1] Natl Univ Def Technol, Sch Informat & Commun, Wuhan 430014, Peoples R China
Keywords
Multi-agent reinforcement learning; Discrete action; Collaborative task; Mixing function; Predictive reward;
DOI
10.1007/978-3-031-40292-0_7
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Inspired by the centralised training with decentralised execution (CTDE) paradigm, the field of multi-agent reinforcement learning (MARL) has made significant progress in tackling cooperative problems with discrete action spaces. Nevertheless, many existing algorithms suffer significant performance degradation when faced with large numbers of agents or more challenging tasks. Furthermore, certain scenarios, such as cooperative environments with penalties, pose serious challenges to these algorithms, which often fail to produce enough cooperative behavior to converge. To address these issues, this study proposes PRACM, a new approach based on the actor-critic framework. PRACM employs a monotonic mixing function to generate a global action-value function, Qtot, which is used to compute the loss for updating the critic network. To handle discrete action spaces, PRACM uses Gumbel-Softmax, and to promote cooperation among agents and adapt to cooperative environments with penalties, it introduces predictive rewards. PRACM was evaluated against several baseline algorithms in the "Cooperative Predator-Prey" and the challenging "SMAC" scenarios. The results show that PRACM scales well as the number of agents and task difficulty increase, and performs better in cooperative tasks with penalties, demonstrating its usefulness in promoting collaboration among agents.
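The abstract names two mechanisms that are standard enough to sketch: Gumbel-Softmax for differentiable sampling of discrete actions, and a monotonic mixing function that combines per-agent values into Qtot. The following minimal numpy sketch illustrates both ideas in isolation; it is not the authors' implementation (the real mixer in QMIX-style methods is a hypernetwork, and `monotonic_mix` here is a hypothetical one-layer stand-in).

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Differentiable relaxation for sampling from a discrete action space.

    Adds Gumbel(0, 1) noise to the logits and applies a temperature-scaled
    softmax; the output is a probability vector that approaches one-hot
    as tau -> 0.
    """
    rng = rng if rng is not None else np.random.default_rng()
    # Sample Gumbel noise via the inverse-CDF trick: g = -log(-log(U)).
    u = rng.uniform(1e-10, 1.0, size=np.shape(logits))
    gumbel = -np.log(-np.log(u))
    y = (np.asarray(logits) + gumbel) / tau
    e = np.exp(y - y.max())          # subtract max for numerical stability
    return e / e.sum()

def monotonic_mix(agent_qs, weights, bias):
    """Toy monotonic mixing: Q_tot = sum_i |w_i| * Q_i + b.

    Taking absolute values of the weights enforces dQ_tot/dQ_i >= 0, the
    monotonicity constraint that lets a global argmax over Q_tot decompose
    into per-agent argmaxes.
    """
    return float(np.abs(np.asarray(weights)) @ np.asarray(agent_qs) + bias)
```

As a usage illustration, lowering `tau` sharpens the sampled action distribution toward one-hot, and `monotonic_mix` never decreases when any individual agent's value increases, which is the property that makes decentralised greedy execution consistent with the centralised critic.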
Pages: 69-82
Page count: 14