Safe multi-agent reinforcement learning for multi-robot control

Cited by: 18
Authors
Gu, Shangding [1, 4]
Kuba, Jakub Grudzien [2]
Chen, Yuanpei [4]
Du, Yali [3]
Yang, Long [4]
Knoll, Alois [1]
Yang, Yaodong [4]
Affiliations
[1] Tech Univ Munich, Dept Comp Sci, Munich, Germany
[2] Univ Oxford, Dept Stat, Oxford, England
[3] Kings Coll London, Dept Informat, London, England
[4] Peking Univ, Inst Artificial Intelligence, Beijing, Peoples R China
Keywords
Constrained Markov game; Constrained policy optimisation; Safe multi-agent benchmarks; Safe multi-robot control; NETWORKS;
DOI
10.1016/j.artint.2023.103905
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A challenging problem in robotics is how to control multiple robots cooperatively and safely in real-world applications. Yet developing multi-robot control methods from the perspective of safe multi-agent reinforcement learning (MARL) has scarcely been studied. To fill this gap, we investigate safe MARL for multi-robot control on cooperative tasks, in which each robot must not only meet its own safety constraints while maximising its reward, but also consider those of others to guarantee safe team behaviours. Firstly, we formulate the safe MARL problem as a constrained Markov game and employ policy optimisation to solve it theoretically. The proposed algorithm guarantees monotonic improvement in reward and satisfaction of safety constraints at every iteration. Secondly, as approximations to the theoretical solution, we propose two safe multi-agent policy gradient methods: Multi-Agent Constrained Policy Optimisation (MACPO) and MAPPO-Lagrangian. Thirdly, we develop the first three safe MARL benchmarks, Safe Multi-Agent MuJoCo (Safe MAMuJoCo), Safe Multi-Agent Robosuite (Safe MARobosuite) and Safe Multi-Agent Isaac Gym (Safe MAIG), to expand the toolkit of the MARL and robot control research communities. Finally, experimental results on the three safe MARL benchmarks indicate that our methods achieve state-of-the-art performance in balancing reward improvement against satisfaction of safety constraints, compared with strong baselines. Demos and code are available at https://sites.google.com/view/aij-safe-marl/. Crown Copyright (c) 2023 Published by Elsevier B.V. All rights reserved.
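As a rough illustration of the Lagrangian idea behind a method such as MAPPO-Lagrangian, the sketch below shows a generic Lagrangian relaxation of a cost-constrained policy objective. It is not the authors' implementation: the function names, the learning rate, the clipping range and the single shared cost limit are all assumptions made for illustration.

```python
from typing import List, Sequence


def update_multiplier(lam: float, mean_cost: float, cost_limit: float,
                      lr: float = 0.1) -> float:
    """Projected dual ascent on the Lagrange multiplier (hypothetical helper).

    The multiplier grows when the observed mean cost exceeds the limit and
    shrinks otherwise; it is clipped at zero so the penalty never turns
    into a bonus for incurring cost.
    """
    return max(0.0, lam + lr * (mean_cost - cost_limit))


def lagrangian_advantage(reward_adv: Sequence[float],
                         cost_adv: Sequence[float],
                         lam: float) -> List[float]:
    """Fold reward and cost advantages into one penalised advantage."""
    return [r - lam * c for r, c in zip(reward_adv, cost_adv)]


def clipped_surrogate(ratio: float, adv: float, eps: float = 0.2) -> float:
    """PPO-style clipped surrogate for a single sample.

    `ratio` is the new-to-old policy probability ratio for one agent's
    action; `eps` is an assumed clipping range.
    """
    clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return min(ratio * adv, clipped * adv)
```

In a training loop of this kind, each agent would maximise the clipped surrogate over the penalised advantage, and the multiplier would be updated after each batch from the measured cost.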
Pages: 24
Related papers
50 in total
  • [41] Safe multi-agent motion planning via filtered reinforcement learning
    Vinod, Abraham P.
    Safaoui, Sleiman
    Chakrabarty, Ankush
    Quirynen, Rien
    Yoshikawa, Nobuyuki
    Di Cairano, Stefano
    2022 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2022, 2022, : 7270 - 7276
  • [42] Hierarchical multi-agent reinforcement learning
    Mohammad Ghavamzadeh
    Sridhar Mahadevan
    Rajbala Makar
    Autonomous Agents and Multi-Agent Systems, 2006, 13 : 197 - 229
  • [43] Control Fusion for Safe Multi-Robot Coordination
    Bostelman, Roger
    Marvel, Jeremy
    MULTISENSOR, MULTISOURCE INFORMATION FUSION: ARCHITECTURES, ALGORITHMS, AND APPLICATIONS 2014, 2014, 9121
  • [44] Decision-Making Under Uncertainty in Multi-Agent and Multi-Robot Systems: Planning and Learning
    Amato, Christopher
    PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018, : 5662 - 5666
  • [45] A deep reinforcement learning approach for multi-agent mobile robot patrolling
    Jana, Meghdeep
    Vachhani, Leena
    Sinha, Arpita
    INTERNATIONAL JOURNAL OF INTELLIGENT ROBOTICS AND APPLICATIONS, 2022, 6 (04) : 724 - 745
  • [46] Leveraging Expert Demonstrations in Robot Cooperation with Multi-Agent Reinforcement Learning
    Zhang, Zhaolong
    Li, Yihui
    Rojas, Juan
    Guan, Yisheng
    INTELLIGENT ROBOTICS AND APPLICATIONS, ICIRA 2021, PT II, 2021, 13014 : 211 - 222
  • [47] Multi-agent reinforcement learning: A survey
    Busoniu, Lucian
    Babuska, Robert
    De Schutter, Bart
    2006 9TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION, ROBOTICS AND VISION, VOLS 1- 5, 2006, : 1133 - +
  • [48] The Dynamics of Multi-Agent Reinforcement Learning
    Dickens, Luke
    Broda, Krysia
    Russo, Alessandra
    ECAI 2010 - 19TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2010, 215 : 367 - 372
  • [49] Multi-agent Exploration with Reinforcement Learning
    Sygkounas, Alkis
    Tsipianitis, Dimitris
    Nikolakopoulos, George
    Bechlioulis, Charalampos P.
    2022 30TH MEDITERRANEAN CONFERENCE ON CONTROL AND AUTOMATION (MED), 2022, : 630 - 635
  • [50] Multi-Agent Reinforcement Learning for Microgrids
    Dimeas, A. L.
    Hatziargyriou, N. D.
    IEEE POWER AND ENERGY SOCIETY GENERAL MEETING 2010, 2010,