Coordination as inference in multi-agent reinforcement learning

Authors
Li, Zhiyuan [1]
Wu, Lijun [1]
Su, Kaile [2]
Wu, Wei [3, 4]
Jing, Yulin [1]
Wu, Tong [1]
Duan, Weiwei [1]
Yue, Xiaofeng [1]
Tong, Xiyi [5]
Han, Yizhou [6]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu, Peoples R China
[2] Griffith Univ, Sch Informat & Commun Technol, Brisbane, Australia
[3] Cent South Univ, Sch Comp Sci & Engn, Changsha, Peoples R China
[4] Xiangjiang Lab, Changsha, Peoples R China
[5] Sichuan Univ Pittsburgh Inst, Chengdu, Peoples R China
[6] Univ Glasgow, Glasgow Int Coll, Glasgow City, Scotland
Keywords
Multi-agent System; Deep reinforcement learning; Non-stationary; Variational inference; Causal inference; Theory of mind; MECHANISMS;
DOI
10.1016/j.neunet.2024.106101
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Centralized Training and Decentralized Execution (CTDE) paradigm, in which a centralized critic can access global information during training while the learned policies are executed in a decentralized way using only local information, has achieved great progress in recent years. Despite this progress, CTDE may suffer from Centralized-Decentralized Mismatch (CDM): the suboptimality of one agent's policy can impair the policy learning of other agents through the centralized joint critic. In contrast to centralized learning, the cooperative model that most closely resembles the way humans cooperate in nature is fully decentralized, i.e., Independent Learning (IL). However, two issues must be addressed before agents can coordinate through IL: (1) how agents become aware of the presence of other agents, and (2) how they coordinate with other agents to improve the joint policy under IL. In this paper, we propose an inference-based coordinated MARL method: Deep Motor System (DMS). DMS first introduces individual intention inference, which allows agents to disentangle other agents from their environment. Second, causal inference is introduced to enhance coordination by reasoning about each agent's effect on others' behavior. The proposed model was evaluated extensively on a series of Multi-Agent MuJoCo and StarCraft II tasks. Results show that the proposed method outperforms independent learning algorithms and that coordination behavior among agents can be learned even without the CTDE paradigm, in comparison with state-of-the-art baselines including IPPO and HAPPO.
Pages: 13