Safe Multi-Agent Reinforcement Learning via Approximate Hamilton-Jacobi Reachability

Authors
Kai Zhu [1]
Fengbo Lan [1]
Wenbo Zhao [1]
Tao Zhang [1]
Affiliations
[1] Tsinghua University, Department of Automation
[2] Beijing National Research Center for Information Science and Technology
Keywords
Multi-agent systems; Deep reinforcement learning; Safety satisfaction; Hamilton-Jacobi reachability
DOI
10.1007/s10846-024-02156-6
Abstract
Multi-Agent Reinforcement Learning (MARL) promises to address the challenges of cooperation and competition among multiple agents, often in safety-critical scenarios. However, progress toward safe MARL remains limited. Current works extend single-agent safe learning approaches, employing shielding or backup policies to ensure safety satisfaction. These approaches, however, require good cooperation among agents, and weakly distributed approaches with centralized shielding become infeasible when agents encounter complex situations such as non-cooperative agents and coordination failures. In this paper, we integrate Hamilton-Jacobi (HJ) reachability theory and present a Centralized Training and Decentralized Execution (CTDE) framework for safe MARL. Our framework enables the learning of safety policies without requiring a system model or pre-training of a shielding layer. Additionally, we enhance adaptability to varying levels of cooperation through a conservative approximate estimate of the value function. Experimental results validate the efficacy of the proposed method, demonstrating that it ensures safety while achieving target tasks under cooperative conditions. Furthermore, our approach is robust to non-cooperative behaviors induced by complex disturbance factors.
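For readers unfamiliar with HJ reachability in a learning setting, the sketch below illustrates the general idea of a safety value function. It is not the method of this paper: it is a single-agent, tabular value iteration using the discounted safety Bellman backup of Fisac et al. (2019), V(s) = (1 - γ) l(s) + γ min{ l(s), max_a V(s') }. The chain world, the drift dynamics, the margin function, and all names (`margin`, `step`, `safety_value_iteration`) are assumptions made purely for illustration.

```python
# Hypothetical toy problem: positions 0..5 on a chain, with a failure cell at 0.
GAMMA = 0.9
STATES = range(6)
ACTIONS = (-1, 0, 1)


def margin(s):
    """Assumed safety margin l(s): negative only in the failure cell s == 0."""
    return s - 0.5


def step(s, a):
    """Assumed dynamics: a drift of 2 pulls positions <= 2 toward the failure cell."""
    drift = 2 if s <= 2 else 0
    return min(max(s + a - drift, 0), 5)


def safety_value_iteration(sweeps=500):
    """Repeat the discounted safety Bellman backup until it has effectively converged."""
    v = [0.0 for _ in STATES]
    for _ in range(sweeps):
        # Synchronous sweep: every state is updated from the previous table.
        v = [
            (1 - GAMMA) * margin(s)
            + GAMMA * min(margin(s), max(v[step(s, a)] for a in ACTIONS))
            for s in STATES
        ]
    return v


values = safety_value_iteration()
# States with non-negative value can be kept out of the failure cell indefinitely.
safe_states = [s for s in STATES if values[s] >= 0]
```

In this toy setup, positions 1 and 2 have a positive margin but a negative value, because the drift carries them into the failure cell whatever the agent does. The value function therefore marks them unsafe even though they are not failures themselves, which is the distinction a reachability-based safety critic is meant to capture.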
Related papers
  • [1] Safe Multi-Agent Reinforcement Learning via Approximate Hamilton-Jacobi Reachability
    Zhu, Kai
    Lan, Fengbo
    Zhao, Wenbo
    Zhang, Tao
    Journal of Intelligent and Robotic Systems: Theory and Applications, 111 (01):
  • [2] Hamilton-Jacobi Reachability in Reinforcement Learning: A Survey
    Ganai, Milan
    Gao, Sicun
    Herbert, Sylvia L.
    IEEE Open Journal of Control Systems, 2024, 3 : 310 - 324
  • [3] Hamilton-Jacobi Multi-Time Reachability
    Doshi, Manan
    Bhabra, Manmeet
    Wiggert, Marius
    Tomlin, Claire J.
    Lermusiaux, Pierre F. J.
    2022 IEEE 61ST CONFERENCE ON DECISION AND CONTROL (CDC), 2022, : 2443 - 2450
  • [4] A Hamilton-Jacobi Formulation for Cooperative Control of Multi-Agent Systems
    Roozbehani, Hajir
    Rudaz, Sylvain
    Gillet, Denis
    2009 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC 2009), VOLS 1-9, 2009, : 4813 - 4818
  • [5] Cooperative Control of Maglev Levitation System via Hamilton-Jacobi-Bellman Multi-Agent Deep Reinforcement Learning
    Zhu, Qi
    Wang, Su-Mei
    Ni, Yi-Qing
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2024, 73 (09) : 12747 - 12759
  • [6] Bridging Hamilton-Jacobi Safety Analysis and Reinforcement Learning
    Fisac, Jaime E.
    Lugovoy, Neil E.
    Rubies-Royo, Vicenc
    Ghosh, Shromona
    Tomlin, Claire J.
    2019 INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2019, : 8550 - 8556
  • [7] Safe multi-agent motion planning via filtered reinforcement learning
    Vinod, Abraham P.
    Safaoui, Sleiman
    Chakrabarty, Ankush
    Quirynen, Rien
    Yoshikawa, Nobuyuki
    Di Cairano, Stefano
    2022 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2022, 2022, : 7270 - 7276
  • [8] Multi-Vehicle Collision Avoidance via Hamilton-Jacobi Reachability and Mixed Integer Programming
    Chen, Mo
    Shih, Jennifer C.
    Tomlin, Claire J.
    2016 IEEE 55TH CONFERENCE ON DECISION AND CONTROL (CDC), 2016, : 1695 - 1700
  • [9] Hamilton-Jacobi Reachability Safety Filter Applications
    Unknown
    IEEE CONTROL SYSTEMS MAGAZINE, 2023, 43 (05) : 148 - 148
  • [10] Formal Reachability Analysis for Multi-Agent Reinforcement Learning Systems
    Wang, Xiaoyan
    Peng, Jun
    Li, Shuqiu
    Li, Bing
    IEEE ACCESS, 2021, 9 : 45812 - 45821