Safe multi-agent reinforcement learning for multi-robot control

Cited by: 18
Authors
Gu, Shangding [1, 4]
Kuba, Jakub Grudzien [2]
Chen, Yuanpei [4]
Du, Yali [3]
Yang, Long [4]
Knoll, Alois [1]
Yang, Yaodong [4]
Affiliations
[1] Tech Univ Munich, Dept Comp Sci, Munich, Germany
[2] Univ Oxford, Dept Stat, Oxford, England
[3] Kings Coll London, Dept Informat, London, England
[4] Peking Univ, Inst Artificial Intelligence, Beijing, Peoples R China
Keywords
Constrained Markov game; Constrained policy optimisation; Safe multi-agent benchmarks; Safe multi-robot control; NETWORKS;
DOI
10.1016/j.artint.2023.103905
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A challenging problem in robotics is how to control multiple robots cooperatively and safely in real-world applications. Yet developing multi-robot control methods from the perspective of safe multi-agent reinforcement learning (MARL) has barely been studied. To fill this gap, we investigate safe MARL for multi-robot control on cooperative tasks, in which each robot must not only meet its own safety constraints while maximising reward, but also consider those of the other robots to guarantee safe team behaviours. Firstly, we formulate the safe MARL problem as a constrained Markov game and employ policy optimisation to solve it theoretically. The proposed algorithm guarantees monotonic improvement in reward and satisfaction of safety constraints at every iteration. Secondly, as approximations to the theoretical solution, we propose two safe multi-agent policy gradient methods: Multi-Agent Constrained Policy Optimisation (MACPO) and MAPPO-Lagrangian. Thirdly, we develop the first three safe MARL benchmarks, Safe Multi-Agent MuJoCo (Safe MAMuJoCo), Safe Multi-Agent Robosuite (Safe MARobosuite) and Safe Multi-Agent Isaac Gym (Safe MAIG), to expand the toolkit of the MARL and robot control research communities. Finally, experimental results on the three safe MARL benchmarks indicate that our methods achieve state-of-the-art performance in balancing reward improvement against safety-constraint satisfaction, compared with strong baselines. Demos and code are available at https://sites.google.com/view/aij-safe-marl/. Crown Copyright (c) 2023 Published by Elsevier B.V. All rights reserved.
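The abstract names a Lagrangian variant (MAPPO-Lagrangian) without giving details. As a rough illustration only, the generic Lagrangian idea for a single cost constraint can be sketched as below: a multiplier grows while the measured cost exceeds its limit and shrinks (down to zero) otherwise, and the policy is trained on a reward advantage penalised by the cost advantage. The function names, the learning rate and the cost limit are all hypothetical, and this is not the authors' implementation.

```python
# Illustrative sketch of a generic Lagrangian relaxation step for one
# safety constraint. All names and numbers here are assumptions made for
# the example; they are not taken from the paper or its code release.

def update_multiplier(lam, avg_cost, cost_limit, lr=0.05):
    """Projected dual ascent: raise lam when cost exceeds the limit,
    lower it otherwise, and never let it go below zero."""
    return max(0.0, lam + lr * (avg_cost - cost_limit))


def penalised_advantage(reward_adv, cost_adv, lam):
    """Advantage used for the policy update: reward advantage minus the
    multiplier-weighted cost advantage."""
    return reward_adv - lam * cost_adv


# Hypothetical per-iteration average episode costs for one agent.
lam = 0.0
for avg_cost in [30.0, 28.0, 26.0, 20.0, 18.0]:
    lam = update_multiplier(lam, avg_cost, cost_limit=25.0)
```

In this toy run the multiplier rises during the first three iterations, where the cost is above the limit of 25, and is then driven back down once the cost falls below it.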
Pages: 24