Task-Agnostic Safety for Reinforcement Learning

Cited by: 1
Authors
Rahman, Md Asifur [1]
Alqahtani, Sarra [1]
Affiliations
[1] Wake Forest Univ, Winston Salem, NC 27101 USA
Funding
U.S. National Science Foundation;
Keywords
Reinforcement Learning; safety; attacks; robustness;
DOI
10.1145/3605764.3623913
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Reinforcement learning (RL) has been an attractive option for designing autonomous systems because of its learning-by-exploration approach. However, this learning process makes RL inherently vulnerable and thus unsuitable for applications where safety is a top priority. To address this issue, researchers have either jointly optimized task and safety or imposed constraints to restrict exploration. This paper takes a different approach, using exploration as an adaptive means to learn robust and safe behavior. To this end, we propose the Task-Agnostic Safety for Reinforcement Learning (TAS-RL) framework, which ensures safety in RL by learning a representation of unsafe behaviors and excluding them from the agent's policy. TAS-RL is task-agnostic and can be integrated with any RL task policy in the same environment, providing a self-protection layer for the system. To evaluate the robustness of TAS-RL, we present a novel study in which TAS-RL and 7 safe RL baselines are tested in constrained Markov decision process (CMDP) environments under white-box action-space perturbations and changes in the environment dynamics. The results show that TAS-RL outperforms all baselines, achieving consistent near-zero safety constraint violations in continuous action spaces with 10 times more variation in the testing environment dynamics.
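Illustrative sketch (not from the paper): the abstract describes a safety layer that can sit in front of any task policy and exclude unsafe behavior. One generic way to picture such a layer is an action filter that falls back to a recovery action when an estimated risk is too high. Everything below, including `make_shielded_policy`, `risk_estimate`, `recovery_policy`, the threshold, and the toy 1-D environment, is a hypothetical assumption chosen for illustration and may differ from how TAS-RL is actually built.

```python
from typing import Callable

# Hypothetical types for a toy 1-D problem: state and action are plain floats.
Policy = Callable[[float], float]
RiskFn = Callable[[float, float], float]


def make_shielded_policy(
    task_policy: Policy,
    risk_estimate: RiskFn,
    recovery_policy: Policy,
    threshold: float = 0.5,
) -> Policy:
    """Wrap any task policy with a task-independent safety filter.

    The filter never inspects the task reward. It only asks whether the
    proposed action looks unsafe and, if so, substitutes a recovery action.
    """

    def policy(state: float) -> float:
        action = task_policy(state)
        if risk_estimate(state, action) > threshold:
            # Proposed action is judged unsafe: exclude it and recover.
            return recovery_policy(state)
        return action

    return policy


# Toy setup (assumption): positions with |x| > 1 are unsafe.
def task_policy(x: float) -> float:
    return 0.5  # always push right, ignoring safety


def risk_estimate(x: float, a: float) -> float:
    return 1.0 if abs(x + a) > 1.0 else 0.0  # stand-in for a learned estimate


def recovery_policy(x: float) -> float:
    return -0.5 if x > 0 else 0.5  # step back toward the origin


shielded = make_shielded_policy(task_policy, risk_estimate, recovery_policy)
```

Because the wrapper only depends on the risk estimate and the recovery behavior, the same filter could in principle be reused with a different `task_policy` in the same environment, which is the sense of "task-agnostic" this sketch tries to convey.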
Pages: 139-148
Number of pages: 10
Related papers
50 in total
  • [21] Learning from History: Task-agnostic Model Contrastive Learning for Image Restoration
    Wu, Gang
    Jiang, Junjun
    Jiang, Kui
    Liu, Xianming
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 6, 2024, : 5976 - 5984
  • [22] Diffused Task-Agnostic Milestone Planner
    Hong, Mineui
    Kang, Minjae
    Oh, Songhwai
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [23] TUSK: Task-Agnostic Unsupervised Keypoints
    Jin, Yuhe
    Sun, Weiwei
    Hosang, Jan
    Trulls, Eduard
    Yi, Kwang Moo
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [24] Task-agnostic representation learning of multimodal twitter data for downstream applications
    Ryan Rivas
    Sudipta Paul
    Vagelis Hristidis
    Evangelos E. Papalexakis
    Amit K. Roy-Chowdhury
    Journal of Big Data, 9
  • [25] TASK-AGNOSTIC CONTINUAL LEARNING USING BASE-CHILD CLASSIFIERS
    Singh, Pranshu Ranjan
    Gopalakrishnan, Saisubramaniam
    Qiao ZhongZheng
    Suganthan, Ponnuthurai N.
    Ramasamy, Savitha
    Ambikapathi, ArulMurugan
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 794 - 798
  • [26] Towards a Task-Agnostic Model of Difficulty Estimation for Supervised Learning Tasks
    Laverghetta, Antonio, Jr.
    Mirzakhalov, Jamshidbek
    Licato, John
    AACL-IJCNLP 2020: THE 1ST CONFERENCE OF THE ASIA-PACIFIC CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 10TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING: PROCEEDINGS OF THE STUDENT RESEARCH WORKSHOP, 2020, : 16 - 23
  • [27] Task-agnostic self-modeling machines
    Kwiatkowski, Robert
    Lipson, Hod
    SCIENCE ROBOTICS, 2019, 4 (26)
  • [28] Task-agnostic representation learning of multimodal twitter data for downstream applications
    Rivas, Ryan
    Paul, Sudipta
    Hristidis, Vagelis
    Papalexakis, Evangelos E.
    Roy-Chowdhury, Amit K.
    JOURNAL OF BIG DATA, 2022, 9 (01)
  • [29] TADA: Task-Agnostic Dialect Adapters for English
    Held, William
    Ziems, Caleb
    Yang, Diyi
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 813 - 824
  • [30] Mimic and Fool: A Task-Agnostic Adversarial Attack
    Chaturvedi, Akshay
    Garain, Utpal
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 32 (04) : 1801 - 1808