Safe Multi-Agent Reinforcement Learning via Approximate Hamilton-Jacobi Reachability

Citations: 0
Authors
Kai Zhu [1]
Fengbo Lan [1]
Wenbo Zhao [1]
Tao Zhang [1]
Affiliations
[1] Tsinghua University, Department of Automation
[2] Beijing National Research Center for Information Science and Technology
Keywords
Multi-agent systems; Deep reinforcement learning; Safety satisfaction; Hamilton-Jacobi reachability
DOI
10.1007/s10846-024-02156-6
Abstract
Multi-Agent Reinforcement Learning (MARL) promises to address the challenges of cooperation and competition among multiple agents, often in safety-critical scenarios. However, progress toward safe MARL remains limited. Current works extend single-agent safe learning approaches, employing shielding or backup policies to ensure safety. These approaches require good cooperation among agents, and weakly distributed approaches with centralized shielding become infeasible when agents encounter complex situations such as non-cooperative agents and coordination failures. In this paper, we integrate Hamilton-Jacobi (HJ) reachability theory and present a Centralized Training and Decentralized Execution (CTDE) framework for safe MARL. Our framework enables the learning of safety policies without requiring a system model or pre-training of a shielding layer. Additionally, we enhance adaptability to varying levels of cooperation through a conservative approximate estimate of the value function. Experimental results validate the proposed method, demonstrating that it ensures safety while achieving target tasks under cooperative conditions. Furthermore, our approach is robust to non-cooperative behaviors induced by complex disturbance factors.
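As a rough illustration of the kind of value function HJ reachability works with, the sketch below runs a discounted safety backup, V(s) = (1 - gamma) * l(s) + gamma * min(l(s), max over actions of V(s')), on a tiny hand-made chain of states. This form of backup is common in reachability-based safe RL, but the abstract does not state the paper's exact update, so treat it as an assumption. The state layout, the margin values l(s), the successor table, and the function name safety_value_iteration are all hypothetical and chosen only for this example; it is single-agent and tabular, not the CTDE method itself.

```python
def safety_value_iteration(margin, successors, gamma=0.9, sweeps=500):
    """Tabular discounted safety backup (illustrative sketch only).

    margin[s]     : safety margin l(s); negative means s is in the failure set.
    successors[s] : states reachable from s in one step (one entry per action).
    Returns a list V where V[s] > 0 suggests s can be kept safe.
    """
    value = list(margin)
    for _ in range(sweeps):
        new_value = []
        for s, succ in enumerate(successors):
            # Best achievable future safety value over the available actions.
            best_next = max(value[n] for n in succ)
            # Worst margin along the path: current margin vs. best future.
            backed_up = min(margin[s], best_next)
            new_value.append((1.0 - gamma) * margin[s] + gamma * backed_up)
        value = new_value
    return value


# Hypothetical 5-state chain: states 0 and 1 can stay put, states 2 and 3
# are forced to slide toward state 4, which is the (absorbing) failure state.
margin = [1.0, 0.5, 0.5, 0.5, -1.0]
successors = [[0, 1], [1, 2], [3], [4], [4]]

values = safety_value_iteration(margin, successors)
safe_states = [s for s, v in enumerate(values) if v > 0]
```

In this toy setup, states 2 and 3 have a positive margin but receive a negative value because every path from them ends in failure, which is the distinction a reachability-style value function is meant to capture. A more conservative variant for non-cooperative settings could, under similar assumptions, replace the max over successors with a worst case over the other agents' actions.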