Safe multi-agent reinforcement learning for multi-robot control

Cited by: 18
Authors
Gu, Shangding [1, 4]
Kuba, Jakub Grudzien [2]
Chen, Yuanpei [4]
Du, Yali [3]
Yang, Long [4]
Knoll, Alois [1]
Yang, Yaodong [4]
Affiliations
[1] Tech Univ Munich, Dept Comp Sci, Munich, Germany
[2] Univ Oxford, Dept Stat, Oxford, England
[3] Kings Coll London, Dept Informat, London, England
[4] Peking Univ, Inst Artificial Intelligence, Beijing, Peoples R China
Keywords
Constrained Markov game; Constrained policy optimisation; Safe multi-agent benchmarks; Safe multi-robot control; NETWORKS;
DOI
10.1016/j.artint.2023.103905
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A challenging problem in robotics is how to control multiple robots cooperatively and safely in real-world applications. Yet developing multi-robot control methods from the perspective of safe multi-agent reinforcement learning (MARL) has rarely been studied. To fill this gap, we investigate safe MARL for multi-robot control on cooperative tasks, in which each robot must not only meet its own safety constraints while maximising its reward, but also consider those of the other robots to guarantee safe team behaviour. Firstly, we formulate the safe MARL problem as a constrained Markov game and employ policy optimisation to solve it theoretically. The proposed algorithm guarantees monotonic improvement in reward and satisfaction of safety constraints at every iteration. Secondly, as approximations to the theoretical solution, we propose two safe multi-agent policy gradient methods: Multi-Agent Constrained Policy Optimisation (MACPO) and MAPPO-Lagrangian. Thirdly, we develop the first three safe MARL benchmarks, Safe Multi-Agent MuJoCo (Safe MAMuJoCo), Safe Multi-Agent Robosuite (Safe MARobosuite) and Safe Multi-Agent Isaac Gym (Safe MAIG), to expand the toolkit of the MARL and robot control research communities. Finally, experimental results on the three safe MARL benchmarks indicate that our methods achieve state-of-the-art performance in balancing reward improvement against safety-constraint satisfaction, compared with strong baselines. Demos and code are available at https://sites.google.com/view/aij-safe-marl/. Crown Copyright (c) 2023 Published by Elsevier B.V. All rights reserved.
Pages: 24