Probabilistic Safeguard for Reinforcement Learning Using Safety Index Guided Gaussian Process Models

Authors
Zhao, Weiye [1]
He, Tairan [1]
Liu, Changliu [1]
Affiliations
[1] Carnegie Mellon Univ, Robot Inst, Pittsburgh, PA 15213 USA
Funding
National Science Foundation (USA)
Keywords
Safe control; Gaussian process; Dynamics learning
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Safety is one of the biggest concerns in applying reinforcement learning (RL) to the physical world. At its core, the challenge is to ensure that RL agents persistently satisfy a hard state constraint without white-box or black-box dynamics models. This paper presents an integrated model learning and safe control framework to safeguard any RL agent, where the environment dynamics are learned as Gaussian processes. The proposed theory provides (i) a novel method to construct an offline dataset for model learning that best achieves safety requirements; (ii) a design rule to construct the safety index to ensure the existence of safe control under control limits; and (iii) a probabilistic safety guarantee (i.e., probabilistic forward invariance) when the model is learned using the aforementioned dataset. Simulation results show that our framework achieves almost zero safety violation on various continuous control tasks.
Pages: 14