A Cooperation Graph Approach for Multiagent Sparse Reward Reinforcement Learning

Cited by: 2
Authors
Fu, Qingxu [1 ,2 ]
Qiu, Tenghai [1 ,2 ]
Pu, Zhiqiang [1 ,2 ]
Yi, Jianqiang [1 ,2 ]
Yuan, Wanmai [3 ]
Affiliations
[1] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[2] Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
[3] Corp Informat Sci Acad China, Elect Technol Grp, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
multiagent system; reinforcement learning; sparse reward;
DOI
10.1109/IJCNN55064.2022.9891991
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Code
081104; 0812; 0835; 1405;
Abstract
Multiagent reinforcement learning (MARL) can solve complex cooperative tasks. However, the efficiency of existing MARL methods relies heavily on well-defined reward functions. Multiagent tasks with sparse reward feedback are especially challenging, not only because of the credit assignment problem but also because of the low probability of obtaining positive reward feedback. In this paper, we design a graph network called the Cooperation Graph (CG). The Cooperation Graph is the combination of two simple bipartite graphs, namely the Agent Clustering subgraph (ACG) and the Cluster Designating subgraph (CDG). Based on this novel graph structure, we then propose a Cooperation Graph Multiagent Reinforcement Learning (CG-MARL) algorithm that can efficiently handle the sparse reward problem in multiagent tasks. In CG-MARL, agents are directly controlled by the Cooperation Graph, and a policy neural network is trained to manipulate this graph, implicitly guiding the agents toward cooperation. This hierarchical structure of CG-MARL leaves room for customized cluster actions, an extensible interface for introducing fundamental cooperation knowledge. In experiments, CG-MARL achieves state-of-the-art performance on sparse reward multiagent benchmarks, including an anti-invasion interception task and a multi-cargo delivery task.
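The abstract describes the Cooperation Graph as two stacked bipartite graphs: the ACG maps agents to clusters and the CDG maps clusters to cluster actions, with a policy network rewiring the edges rather than choosing per-agent actions. Below is a minimal Python sketch of that data structure, assuming discrete agent, cluster, and cluster-action indices; all names (CooperationGraph, rewire_agent, rewire_cluster, agent_actions) are hypothetical illustrations and are not taken from the paper or its released code.

from dataclasses import dataclass

@dataclass
class CooperationGraph:
    """Two stacked bipartite graphs: agents -> clusters (ACG)
    and clusters -> cluster actions (CDG)."""
    n_agents: int
    n_clusters: int
    n_cluster_actions: int

    def __post_init__(self):
        # ACG edges: agent index -> cluster index (all agents start in cluster 0).
        self.agent_to_cluster = [0] * self.n_agents
        # CDG edges: cluster index -> cluster-action index (all clusters start idle).
        self.cluster_to_action = [0] * self.n_clusters

    def rewire_agent(self, agent: int, cluster: int) -> None:
        # Policy-network edit of an ACG edge (agent clustering).
        assert 0 <= cluster < self.n_clusters
        self.agent_to_cluster[agent] = cluster

    def rewire_cluster(self, cluster: int, action: int) -> None:
        # Policy-network edit of a CDG edge (cluster designating).
        assert 0 <= action < self.n_cluster_actions
        self.cluster_to_action[cluster] = action

    def agent_actions(self) -> list:
        # Agents are controlled by the graph: each agent executes the
        # cluster action designated for its cluster.
        return [self.cluster_to_action[c] for c in self.agent_to_cluster]

# Example: the policy edits a few edges, then the graph drives all agents.
cg = CooperationGraph(n_agents=4, n_clusters=2, n_cluster_actions=3)
cg.rewire_agent(3, 1)      # move agent 3 into cluster 1
cg.rewire_cluster(1, 2)    # cluster 1 executes cluster action 2
print(cg.agent_actions())  # [0, 0, 0, 2]

In this reading, the policy's action space is over graph edits (and thus scales with clusters, not agents), which is one plausible way the hierarchical structure eases the sparse reward problem.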
Pages: 8