Reducing Safety Interventions in Provably Safe Reinforcement Learning

Cited by: 0
Authors
Thumm, Jakob [1 ]
Pelat, Guillaume [1 ]
Althoff, Matthias [1 ]
Affiliations
[1] Tech Univ Munich, Sch Informat, D-85748 Garching, Germany
Funding
EU Horizon 2020;
Keywords
OPTIMIZATION;
DOI
10.1109/IROS55552.2023.10342464
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep Reinforcement Learning (RL) has shown promise in addressing complex robotic challenges. In real-world applications, RL is often accompanied by failsafe controllers as a last resort to avoid catastrophic events. While necessary for safety, these interventions can result in undesirable behaviors, such as abrupt braking or aggressive steering. This paper proposes two safety intervention reduction methods, proactive replacement and proactive projection, which change the agent's action if it would lead to a potential failsafe intervention. These approaches are compared to state-of-the-art constrained RL on the OpenAI Safety Gym benchmark and a human-robot collaboration task. Our study demonstrates that combining our methods with provably safe RL leads to high-performing policies with zero safety violations and a low number of failsafe interventions. Our versatile method can be applied to a wide range of real-world robotic tasks, while effectively improving safety without sacrificing task performance.
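The abstract only names the mechanism briefly; as a rough illustration, the Python sketch below shows the general shape of the two proposed strategies, assuming a verification routine and a safe action set as typically used in provably safe RL. All identifiers here (verify_action, sample_safe_action, project_to_safe_set) are hypothetical stand-ins for this sketch and are not the authors' implementation.

# Minimal sketch of the two intervention-reduction ideas named in the
# abstract. verify_action stands in for the formal safety check of a
# provably safe RL shield; sample_safe_action and project_to_safe_set
# stand in for access to the verified-safe action set. All names are
# hypothetical, not taken from the paper.

def proactive_replacement(state, action, verify_action, sample_safe_action):
    # Keep the agent's action if it is verified safe; otherwise replace
    # it with a verified-safe action before a failsafe intervention
    # would be triggered.
    if verify_action(state, action):
        return action
    return sample_safe_action(state)

def proactive_projection(state, action, verify_action, project_to_safe_set):
    # Keep the agent's action if it is verified safe; otherwise project
    # it onto the safe action set, i.e., pick the closest safe action
    # (e.g., in the Euclidean sense) so the executed behavior stays near
    # the agent's intent.
    if verify_action(state, action):
        return action
    return project_to_safe_set(state, action)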
Pages: 7515-7522
Number of pages: 8