Task-Agnostic Safety for Reinforcement Learning

Cited by: 1
Authors:
Rahman, Md Asifur [1]
Alqahtani, Sarra [1]
Affiliations:
[1] Wake Forest Univ, Winston Salem, NC 27101 USA
Funding:
U.S. National Science Foundation
Keywords:
Reinforcement Learning; safety; attacks; robustness
DOI: 10.1145/3605764.3623913
CLC classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract:
Reinforcement learning (RL) has been an attractive approach for designing autonomous systems due to its learning-by-exploration paradigm. However, this learning process makes RL inherently vulnerable and thus unsuitable for applications where safety is a top priority. To address this issue, researchers have either jointly optimized for task performance and safety or imposed constraints to restrict exploration. This paper takes a different approach by utilizing exploration as an adaptive means to learn robust and safe behavior. To this end, we propose the Task-Agnostic Safety for Reinforcement Learning (TAS-RL) framework, which ensures safety in RL by learning a representation of unsafe behaviors and excluding them from the agent's policy. TAS-RL is task-agnostic and can be integrated with any RL task policy in the same environment, providing a self-protection layer for the system. To evaluate the robustness of TAS-RL, we present a novel study in which TAS-RL and seven safe RL baselines are tested in constrained Markov decision process (CMDP) environments under white-box action-space perturbations and changes in the environment dynamics. The results show that TAS-RL outperforms all baselines, achieving consistently near-zero safety-constraint violations in continuous action spaces even under 10 times more variation in the testing environment dynamics.
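The abstract describes TAS-RL as a self-protection layer that learns a representation of unsafe behaviors and excludes them from the agent's policy, independently of the task policy. The Python sketch below illustrates that general idea only, as a minimal action shield: the risk estimator, the fallback safety policy, and the 0.5 threshold are all hypothetical placeholders assumed for illustration, not the paper's actual method.

# Hypothetical sketch of a task-agnostic safety layer in the spirit of the
# abstract. A learned unsafe-behavior model scores each proposed action; risky
# task actions are overridden by a fallback safety policy. All names and the
# threshold value are illustrative assumptions, not TAS-RL's implementation.

class SafetyLayer:
    """Wraps any task policy with a task-agnostic safety check."""

    def __init__(self, unsafe_score, safe_policy, threshold=0.5):
        self.unsafe_score = unsafe_score  # (state, action) -> estimated risk in [0, 1]
        self.safe_policy = safe_policy    # fallback policy trained to avoid unsafe behavior
        self.threshold = threshold        # risk above which the task action is overridden

    def act(self, state, task_action):
        # Pass the task action through only if its estimated risk is low;
        # otherwise substitute the safety policy's action.
        if self.unsafe_score(state, task_action) < self.threshold:
            return task_action
        return self.safe_policy(state)


# Toy usage: a 1-D state where actions pushing |state + action| past 1.0 are unsafe.
risk = lambda s, a: 1.0 if abs(s + a) > 1.0 else 0.0
fallback = lambda s: -0.1 * s  # gently steer back toward the origin
layer = SafetyLayer(risk, fallback)
print(layer.act(0.95, 0.2))  # risky: overridden by the fallback, prints -0.095
print(layer.act(0.0, 0.2))   # safe: passed through, prints 0.2

The design point this sketch tries to capture is the decoupling: the shield never touches the task policy's parameters, which is what would allow it to wrap any task policy trained in the same environment.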
Pages: 139-148
Page count: 10
Related papers (showing 10 of 50):
  • [1] Task-agnostic Exploration in Reinforcement Learning
    Zhang, Xuezhou
    Ma, Yuzhe
    Singla, Adish
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33 (NEURIPS 2020), 2020
  • [2] Latent Plans for Task-Agnostic Offline Reinforcement Learning
    Rosete-Beas, Erick
    Mees, Oier
    Kalweit, Gabriel
    Boedecker, Joschka
    Burgard, Wolfram
    CONFERENCE ON ROBOT LEARNING, VOL 205, 2022: 1838-1849
  • [3] Task-Agnostic Dynamics Priors for Deep Reinforcement Learning
    Du, Yilun
    Narasimhan, Karthik
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019
  • [4] Continual deep reinforcement learning with task-agnostic policy distillation
    Hafez, Muhammad Burhan
    Erekmen, Kerim
    SCIENTIFIC REPORTS, 2024, 14 (1)
  • [5] Continual Deep Reinforcement Learning with Task-Agnostic Policy Distillation
    Hafez, Muhammad Burhan
    Erekmen, Kerim
    arXiv (preprint)
  • [6] A Task-Agnostic Regularizer for Diverse Subpolicy Discovery in Hierarchical Reinforcement Learning
    Huo, Liangyu
    Wang, Zulin
    Xu, Mai
    Song, Yuhang
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2023, 53 (3): 1932-1944
  • [7] Task-Agnostic Online Reinforcement Learning with an Infinite Mixture of Gaussian Processes
    Xu, Mengdi
    Ding, Wenhao
    Zhu, Jiacheng
    Liu, Zuxin
    Chen, Baiming
    Zhao, Ding
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33 (NEURIPS 2020), 2020
  • [8] Task-Agnostic Continual Reinforcement Learning: Gaining Insights and Overcoming Challenges
    Caccia, Massimo
    Mueller, Jonas
    Kim, Taesup
    Charlin, Laurent
    Fakoor, Rasool
    CONFERENCE ON LIFELONG LEARNING AGENTS, VOL 232, 2023: 89-119
  • [9] Loss Decoupling for Task-Agnostic Continual Learning
    Liang, Yan-Shuo
    Li, Wu-Jun
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [10] Hierarchically structured task-agnostic continual learning
    Hihn, Heinke
    Braun, Daniel A.
    MACHINE LEARNING, 2023, 112: 655-686