MRRC: Multi-agent Reinforcement Learning with Rectification Capability in Cooperative Tasks

Authors
Yu, Sheng [1 ]
Zhu, Wei [1 ]
Liu, Shuhong [1 ]
Gong, Zhengwen [1 ]
Chen, Haoran [1 ]
Affiliations
[1] Natl Univ Def Technol, Sch Informat & Commun, Wuhan 430014, Peoples R China
Keywords
Multi-agent reinforcement learning; Cooperative task; Individual reward rectification; Monotonic mix function;
DOI
10.1007/978-981-99-8082-6_16
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Motivated by the centralised training with decentralised execution (CTDE) paradigm, multi-agent reinforcement learning (MARL) algorithms have made significant strides in addressing cooperative tasks. However, the challenges of sparse environmental rewards and limited scalability have impeded further advancements in MARL. In response, MRRC, a novel actor-critic-based approach, is proposed. MRRC tackles the sparse-reward problem by equipping each agent with both an individual policy and a cooperative policy, harnessing the benefits of the individual policy's rapid convergence and the cooperative policy's global optimality. To enhance scalability, MRRC employs a monotonic mix network to rectify the state-action value function Q of each agent, yielding the joint value function Qtot that facilitates global updates of the entire critic network. Additionally, the Gumbel-Softmax technique is introduced to rectify discrete actions, enabling MRRC to handle discrete tasks effectively. By comparing MRRC with advanced baseline algorithms in the "Predator-Prey" and challenging "SMAC" environments, as well as conducting ablation experiments, this study demonstrates the superior performance of MRRC. The experimental results reveal the efficacy of MRRC in reward-sparse environments and its ability to scale well with increasing numbers of agents.
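The two mechanisms named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the hypernetwork producing the mixing weights is reduced to a single linear map, and all names (`monotonic_mix`, `gumbel_softmax`, `w_hyper`, `b_hyper`) are assumptions for illustration. The key ideas survive the simplification: forcing the state-conditioned mixing weights to be non-negative makes Qtot monotonic in every agent's Q (so an argmax over individual Qs is also an argmax over Qtot), and Gumbel-Softmax gives a differentiable relaxation of discrete action sampling.

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Differentiable relaxation of one-hot sampling over discrete actions.

    Adds Gumbel(0, 1) noise to the logits and applies a temperature-scaled
    softmax; as tau -> 0 the output approaches a one-hot action.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    gumbel = -np.log(-np.log(rng.uniform(1e-10, 1.0, size=logits.shape)))
    y = (logits + gumbel) / tau
    y = np.exp(y - y.max(axis=-1, keepdims=True))   # stable softmax
    return y / y.sum(axis=-1, keepdims=True)

def monotonic_mix(agent_qs, state, w_hyper, b_hyper):
    """QMIX-style monotonic mixer (single-layer sketch).

    Weights are generated from the global state and forced non-negative
    with abs(), guaranteeing dQ_tot / dQ_i >= 0 for every agent i.
    """
    w = np.abs(state @ w_hyper)     # (n_agents,) non-negative mixing weights
    b = float(state @ b_hyper)      # state-dependent bias
    return float(agent_qs @ w + b)

# Usage: raising any single agent's Q can never lower the mixed Qtot.
rng = np.random.default_rng(0)
state = rng.normal(size=4)
w_hyper = rng.normal(size=(4, 3))   # hypothetical hypernetwork parameters
b_hyper = rng.normal(size=4)
qs = np.array([1.0, 2.0, 3.0])
q_before = monotonic_mix(qs, state, w_hyper, b_hyper)
q_after = monotonic_mix(qs + np.array([0.5, 0.0, 0.0]), state, w_hyper, b_hyper)

probs = gumbel_softmax(np.array([2.0, 0.5, -1.0]), tau=0.5)
```

In the full method the per-weight linear map is replaced by a hypernetwork with hidden layers, but the monotonicity argument is identical: it rests only on the non-negativity of the mixing weights, not on the depth of the network that produces them.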
Pages: 204-218 (15 pages)