Safe multi-agent reinforcement learning for multi-robot control

Cited by: 18

Authors:

Gu, Shangding [1,4]

Kuba, Jakub Grudzien [2]

Chen, Yuanpei [4]

Du, Yali [3]

Yang, Long [4]

Knoll, Alois [1]

Yang, Yaodong [4]

Affiliations:

[1] Tech Univ Munich, Dept Comp Sci, Munich, Germany

[2] Univ Oxford, Dept Stat, Oxford, England

[3] Kings Coll London, Dept Informat, London, England

[4] Peking Univ, Inst Artificial Intelligence, Beijing, Peoples R China

Source:

ARTIFICIAL INTELLIGENCE | 2023 / Volume 319

Keywords:

Constrained Markov game; Constrained policy optimisation; Safe multi-agent benchmarks; Safe multi-robot control; NETWORKS;

DOI:

10.1016/j.artint.2023.103905

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A challenging problem in robotics is how to control multiple robots cooperatively and safely in real-world applications. Yet developing multi-robot control methods from the perspective of safe multi-agent reinforcement learning (MARL) has rarely been studied. To fill this gap, in this study, we investigate safe MARL for multi-robot control on cooperative tasks, in which each individual robot has to not only meet its own safety constraints while maximising their reward, but also consider those of others to guarantee safe team behaviours. Firstly, we formulate the safe MARL problem as a constrained Markov game and employ policy optimisation to solve it theoretically. The proposed algorithm guarantees monotonic improvement in reward and satisfaction of safety constraints at every iteration. Secondly, as approximations to the theoretical solution, we propose two safe multi-agent policy gradient methods: Multi-Agent Constrained Policy Optimisation (MACPO) and MAPPO-Lagrangian. Thirdly, we develop the first three safe MARL benchmarks, Safe Multi-Agent MuJoCo (Safe MAMuJoCo), Safe Multi-Agent Robosuite (Safe MARobosuite) and Safe Multi-Agent Isaac Gym (Safe MAIG), to expand the toolkit of MARL and robot control research communities. Finally, experimental results on the three safe MARL benchmarks indicate that our methods can achieve state-of-the-art performance in the balance between improving reward and satisfying safety constraints compared with strong baselines. Demos and code are available at https://sites.google.com/view/aij-safe-marl/. Crown Copyright (c) 2023 Published by Elsevier B.V. All rights reserved.


Pages: 24

50 items in total

[31] Multi-agent Reinforcement Learning for Traffic Signal Control
Prabuchandran, K. J.
Kumar, Hemanth A. N.
Bhatnagar, Shalabh
2014 IEEE 17TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS (ITSC), 2014 : 2529 - 2534
[32] MARLYC: Multi-Agent Reinforcement Learning Yaw Control
Kadoche, Elie
Gourvenec, Sebastien
Pallud, Maxime
Levent, Tanguy
RENEWABLE ENERGY, 2023, 217
[33] Dynamic Multi-Agent Reinforcement Learning for Control Optimization
Fagan, Derek
Meier, Rene
PROCEEDINGS FIFTH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS, MODELLING AND SIMULATION, 2014 : 99 - 104
[34] Cranes control using multi-agent reinforcement learning
Arai, S
Miyazaki, K
Kobayashi, S
INTELLIGENT AUTONOMOUS SYSTEMS: IAS-5, 1998 : 335 - 342
[35] Multi-Agent Cognition Difference Reinforcement Learning for Multi-Agent Cooperation
Wang, Huimu
Qiu, Tenghai
Liu, Zhen
Pu, Zhiqiang
Yi, Jianqiang
Yuan, Wanmai
2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021
[36] Multi-Agent Uncertainty Sharing for Cooperative Multi-Agent Reinforcement Learning
Chen, Hao
Yang, Guangkai
Zhang, Junge
Yin, Qiyue
Huang, Kaiqi
2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
[37] Multi-Agent Reinforcement Learning With Distributed Targeted Multi-Agent Communication
Xu, Chi
Zhang, Hui
Zhang, Ya
2023 35TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2023 : 2915 - 2920
[38] Cooperative Multi-Robot Hierarchical Reinforcement Learning
Setyawan, Gembong Edhi
Hartono, Pitoyo
Sawada, Hideyuki
INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (09) : 35 - 44
[39] Barrier function based safe reinforcement learning for multi-agent systems
Yao, Ying
Zhang, Dianfeng
Wu, Zhaojing
Shao, Guangru
2023 35TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2023 : 1714 - 1721
[40] Assured Deep Multi-Agent Reinforcement Learning for Safe Robotic Systems
Riley, Joshua
Calinescu, Radu
Paterson, Colin
Kudenko, Daniel
Banks, Alec
AGENTS AND ARTIFICIAL INTELLIGENCE, ICAART 2021, 2022, 13251 : 158 - 180
