Communication-robust multi-agent learning by adaptable auxiliary multi-agent adversary generation

Cited by: 0
Authors
Yuan, Lei [1 ,2 ]
Chen, Feng [1 ]
Zhang, Zongzhang [1 ]
Yu, Yang [1 ,2 ]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
[2] Polixir Technol, Nanjing 211106, Peoples R China
Funding
National Key R&D Program of China; National Natural Science Foundation of China;
Keywords
multi-agent communication; adversarial training; robustness validation; reinforcement learning;
DOI
10.1007/s11704-023-2733-5
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Communication can promote coordination in cooperative Multi-Agent Reinforcement Learning (MARL). Existing works mainly focus on improving the communication efficiency of agents, neglecting that real-world communication is much more challenging, as noise or potential attackers may exist. The robustness of communication-based policies thus becomes an urgent and severe issue that needs more exploration. In this paper, we posit that an ego system trained with auxiliary adversaries can handle this limitation, and we propose an adaptable method of Multi-Agent Auxiliary Adversaries Generation for robust Communication, dubbed MA3C, to obtain a robust communication-based policy. Specifically, we introduce a novel message-attacking approach that models the learning of the auxiliary attackers as a cooperative problem under a shared goal, minimizing the coordination ability of the ego system, so that every information channel may suffer a distinct message attack. Furthermore, as naive adversarial training may impede the generalization ability of the ego system, we design an attacker-population generation approach based on evolutionary learning. Finally, the ego system is paired with the attacker population and alternately trained against the continuously evolving attackers to improve its robustness, meaning that both the ego system and the attackers are adaptable. Extensive experiments on multiple benchmarks indicate that our proposed MA3C provides comparable or better robustness and generalization ability than the baselines.
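The abstract outlines an alternating scheme: a population of message attackers evolves toward the shared goal of degrading the ego system's coordination, and the ego system is then trained against the evolved population. A minimal Python sketch of that loop follows; every name in it (MessageAttacker, coordination_score, train_ma3c_sketch) and the surrogate fitness are hypothetical illustrations under assumed toy dynamics, not the authors' implementation.

import random

class MessageAttacker:
    """Auxiliary adversary that perturbs an inter-agent message channel."""
    def __init__(self, noise_scale):
        self.noise_scale = noise_scale  # strength of the message perturbation

    def attack(self, message):
        # Perturb every entry of the message vector; distinct attackers can
        # impose distinct perturbations on distinct channels.
        return [m + random.gauss(0.0, self.noise_scale) for m in message]

    def mutate(self):
        # Evolutionary variation: jitter the attack strength of a survivor.
        return MessageAttacker(max(0.0, self.noise_scale + random.gauss(0.0, 0.1)))

def coordination_score(ego_policy, attacker):
    # Toy surrogate for a rollout: a real system would run episodes in which
    # the attacker perturbs messages and return the ego team's score.
    clean = [1.0, 1.0]
    attacked = attacker.attack(clean)
    damage = sum(abs(a - c) for a, c in zip(attacked, clean))
    return ego_policy - damage

def train_ma3c_sketch(generations=10, population_size=8):
    ego_policy = 0.0  # scalar stand-in for the ego system's parameters
    population = [MessageAttacker(random.random()) for _ in range(population_size)]
    for _ in range(generations):
        # 1) Attacker step: the population shares one goal, minimizing the
        #    ego score, so selection keeps the most damaging attackers.
        population.sort(key=lambda a: coordination_score(ego_policy, a))
        survivors = population[: population_size // 2]
        population = survivors + [a.mutate() for a in survivors]
        # 2) Ego step: improve the policy against the evolved population
        #    (a real implementation would apply an RL update here).
        worst = min(coordination_score(ego_policy, a) for a in population)
        ego_policy += 0.1 if worst < 0 else 0.05  # toy ascent step
    return ego_policy, population

if __name__ == "__main__":
    policy, attackers = train_ma3c_sketch()
    print(f"toy ego value: {policy:.2f} against {len(attackers)} attackers")

In a real pipeline, coordination_score would roll out the communication-based policy under attacked messages and the ego update would be a full MARL step; only the alternation and the population structure are the point of this sketch.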
Pages: 17
Related Papers
50 records in total
  • [31] Robust multi-agent reinforcement learning for noisy environments
    Chen, Xinning
    Liu, Xuan
    Luo, Canhui
    Yin, Jiangjin
    [J]. PEER-TO-PEER NETWORKING AND APPLICATIONS, 2022, 15 : 1045 - 1056
  • [32] Agendas for multi-agent learning
    Gordon, Geoffrey J.
    [J]. ARTIFICIAL INTELLIGENCE, 2007, 171 (07) : 392 - 401
  • [33] Learning multi-agent cooperation
    Rivera, Corban
    Staley, Edward
    Llorens, Ashley
    [J]. FRONTIERS IN NEUROROBOTICS, 2022, 16
  • [34] Attritable Multi-Agent Learning
    Cybenko, George
    Hallman, Roger
    [J]. DISRUPTIVE TECHNOLOGIES IN INFORMATION SCIENCES V, 2021, 11751
  • [35] Learning Efficient and Robust Multi-Agent Communication via Graph Information Bottleneck
    Ding, Shifei
    Du, Wei
    Ding, Ling
    Guo, Lili
    Zhang, Jian
    [J]. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 16, 2024 : 17346 - 17353
  • [36] Targeted Multi-Agent Communication with Deep Metric Learning
    Miao, Hua
    Yu, Nanxiang
    [J]. ENGINEERING LETTERS, 2023, 31 (02) : 712 - 723
  • [37] Multi-agent reinforcement learning based on local communication
    Zhang, Wenxu
    Ma, Lei
    Li, Xiaonan
    [J]. CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2019, 22 (Suppl 6) : 15357 - 15366
  • [38] Multi-Agent Deep Reinforcement Learning with Emergent Communication
    Simoes, David
    Lau, Nuno
    Reis, Luis Paulo
    [J]. 2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019,
  • [39] Learning Individually Inferred Communication for Multi-Agent Cooperation
    Ding, Ziluo
    Huang, Tiejun
    Lu, Zongqing
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [40] Multi-Agent Path Finding with Prioritized Communication Learning
    Li, Wenhao
    Chen, Hongjun
    Jin, Bo
    Tan, Wenzhe
    Zha, Hongyuan
    Wang, Xiangfeng
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2022, 2022, : 10695 - 10701