Intrinsic Action Tendency Consistency for Cooperative Multi-Agent Reinforcement Learning

Cited by: 0
Authors
Zhang, Junkai [1 ,2 ]
Zhang, Yifan [1 ,3 ,4 ]
Zhang, Xi Sheryl [1 ,3 ,4 ]
Zang, Yifan [1 ,2 ]
Cheng, Jian [1 ,3 ,4 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
[3] Univ Chinese Acad Sci, Nanjing, Peoples R China
[4] Nanjing Artificial Intelligence Res AI, Nanjing, Peoples R China
Funding
National Key R&D Program of China;
Keywords
DOI
None available
CLC Classification Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Efficient collaboration in the centralized training with decentralized execution (CTDE) paradigm remains a challenge in cooperative multi-agent systems. We identify divergent action tendencies among agents as a significant obstacle to CTDE's training efficiency, requiring a large number of training samples to achieve a unified consensus on agents' policies. This divergence stems from the lack of adequate team consensus-related guidance signals during credit assignments in CTDE. To address this, we propose Intrinsic Action Tendency Consistency, a novel approach for cooperative multi-agent reinforcement learning. It integrates intrinsic rewards, obtained through an action model, into a reward-additive CTDE (RA-CTDE) framework. We formulate an action model that enables surrounding agents to predict the central agent's action tendency. Leveraging these predictions, we compute a cooperative intrinsic reward that encourages agents to match their actions with their neighbors' predictions. We establish the equivalence between RA-CTDE and CTDE through theoretical analyses, demonstrating that CTDE's training process can be achieved using agents' individual targets. Building on this insight, we introduce a novel method to combine intrinsic rewards and CTDE. Extensive experiments on challenging tasks in SMAC and GRF benchmarks showcase the improved performance of our method.
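To make the mechanism described in the abstract concrete, the short Python sketch below illustrates one way neighbors' action-tendency predictions could be turned into an intrinsic consistency reward and combined additively with the team reward. This is an illustrative sketch under stated assumptions, not the paper's implementation: the function names, the placeholder uniform action model, the mean-probability reward form, and the coefficient beta are all invented for exposition.

# Minimal sketch (not the authors' code): neighbors predict the central
# agent's action tendency; the central agent receives an intrinsic reward
# for matching those predictions, added to the team reward.
import numpy as np

def action_model_predict(neighbor_obs: np.ndarray, n_actions: int) -> np.ndarray:
    # Stand-in for a learned action model: each neighbor maps its observation
    # to a distribution over the central agent's next action. A real model
    # would be a trained network; here we return a uniform placeholder.
    return np.full((neighbor_obs.shape[0], n_actions), 1.0 / n_actions)

def intrinsic_reward(neighbor_obs: np.ndarray, central_action: int, n_actions: int) -> float:
    # Reward the central agent for acting as its neighbors expect: the mean
    # predicted probability of the chosen action (higher = more consistent
    # with the neighbors' action-tendency predictions).
    preds = action_model_predict(neighbor_obs, n_actions)  # (n_neighbors, n_actions)
    return float(preds[:, central_action].mean())

def shaped_reward(team_reward: float, r_int: float, beta: float = 0.1) -> float:
    # Reward-additive combination for an individual training target:
    # environment (team) reward plus a scaled intrinsic consistency bonus.
    return team_reward + beta * r_int

# Toy usage: 3 neighbors, 8-dim observations, 5 discrete actions (all made up).
obs = np.random.rand(3, 8)
r = shaped_reward(team_reward=1.0, r_int=intrinsic_reward(obs, central_action=2, n_actions=5))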
Pages: 17600 - 17608
Page count: 9
Related Papers
50 records in total
  • [21] Cooperative multi-agent game based on reinforcement learning
    Liu, Hongbo
    HIGH-CONFIDENCE COMPUTING, 2024, 4 (01)
  • [22] Reinforcement learning of coordination in cooperative multi-agent systems
    Kapetanakis, S
    Kudenko, D
    EIGHTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-02)/FOURTEENTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE (IAAI-02), PROCEEDINGS, 2002: 326 - 331
  • [23] Training Cooperative Agents for Multi-Agent Reinforcement Learning
    Bhalla, Sushrut
    Subramanian, Sriram G.
    Crowley, Mark
    AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019: 1826 - 1828
  • [24] Cooperative Exploration for Multi-Agent Deep Reinforcement Learning
    Liu, Iou-Jen
    Jain, Unnat
    Yeh, Raymond A.
    Schwing, Alexander G.
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [25] Analysing factorizations of action-value networks for cooperative multi-agent reinforcement learning
    Castellini, Jacopo
    Oliehoek, Frans A.
    Savani, Rahul
    Whiteson, Shimon
    AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2021, 35
  • [26] Baselines for joint-action reinforcement learning of coordination in cooperative multi-agent systems
    Carpenter, M
    Kudenko, D
    ADAPTIVE AGENTS AND MULTI-AGENT SYSTEMS II: ADAPTATION AND MULTI-AGENT LEARNING, 2005, 3394: 55 - 72
  • [27] Cooperative Action Acquisition Based on Intention Estimation in a Multi-Agent Reinforcement Learning System
    Tsubakimoto, Tatsuya
    Kobayashi, Kunikazu
    ELECTRONICS AND COMMUNICATIONS IN JAPAN, 2017, 100 (06): 3 - 10
  • [28] Analysing factorizations of action-value networks for cooperative multi-agent reinforcement learning
    Castellini, Jacopo
    Oliehoek, Frans A.
    Savani, Rahul
    Whiteson, Shimon
    AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2021, 35 (02)
  • [29] LIIR: Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning
    Du, Yali
    Han, Lei
    Fang, Meng
    Dai, Tianhong
    Liu, Ji
    Tao, Dacheng
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [30] Explainable Action Advising for Multi-Agent Reinforcement Learning
    Guo, Yue
    Campbell, Joseph
    Stepputtis, Simon
    Li, Ruiyu
    Hughes, Dana
    Fang, Fei
    Sycara, Katia
    2023 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA, 2023: 5515 - 5521