Safe Reinforcement Learning via Formal Methods: Toward Safe Control Through Proof and Learning

Cited by: 0
Authors
Fulton, Nathan [1 ]
Platzer, Andre [1 ]
Affiliations
[1] Carnegie Mellon Univ, Dept Comp Sci, Pittsburgh, PA 15213 USA
Funding
National Science Foundation (US);
Keywords
MARKOV DECISION-PROCESSES;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Formal verification provides a high degree of confidence in safe system operation, but only if reality matches the verified model. Although a good model will be accurate most of the time, even the best models are incomplete. This is especially true in Cyber-Physical Systems because high-fidelity physical models of systems are expensive to develop and often intractable to verify. Conversely, reinforcement learning-based controllers are lauded for their flexibility in unmodeled environments, but do not provide guarantees of safe operation. This paper presents an approach for provably safe learning that provides the best of both worlds: the exploration and optimization capabilities of learning along with the safety guarantees of formal verification. Our main insight is that formal verification combined with verified runtime monitoring can ensure the safety of a learning agent. Verification results are preserved whenever learning agents limit exploration within the confines of verified control choices, as long as observed reality comports with the model used for off-line verification. When a model violation is detected, the agent abandons efficiency and instead attempts to learn a control strategy that guides the agent to a modeled portion of the state space. We prove that our approach toward incorporating knowledge about safe control into learning systems preserves safety guarantees, and demonstrate that we retain the empirical performance benefits provided by reinforcement learning. We also explore various points in the design space for these justified speculative controllers in a simple model of adaptive cruise control for autonomous cars.
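The abstract's core loop can be illustrated with a minimal Python sketch. This is not the authors' implementation (which uses formally verified monitors synthesized from differential dynamic logic models); the names `model_monitor`, `controller_monitor`, and `choose_action` are hypothetical stand-ins. The idea: while the observed transition comports with the verified model, exploration is restricted to control choices the offline proof certifies as safe; on a detected model violation, the agent falls back to unrestricted learning aimed at recovery.

```python
def safe_actions(state, candidate_actions, controller_monitor):
    """Filter to actions the (assumed) verified controller monitor approves."""
    return [a for a in candidate_actions if controller_monitor(state, a)]

def choose_action(state, prev_state, prev_action, q_values, candidate_actions,
                  model_monitor, controller_monitor):
    # If the last observed transition comports with the verified model
    # (or there is no history yet), stay within proven-safe choices.
    if prev_state is None or model_monitor(prev_state, prev_action, state):
        allowed = safe_actions(state, candidate_actions, controller_monitor)
        if allowed:
            # Exploit learned Q-values, restricted to verified-safe actions.
            return max(allowed, key=lambda a: q_values.get((state, a), 0.0))
    # Model violation detected: abandon the safety restriction and learn a
    # recovery strategy that steers back toward the modeled state space.
    return max(candidate_actions, key=lambda a: q_values.get((state, a), 0.0))
```

In the real approach the two monitors are not hand-written predicates but runtime monitors whose correctness is itself formally proved, which is what lets the safety theorem transfer from the offline proof to the running learner.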
Pages: 6485-6492
Page count: 8
Related Papers (10 of 50 shown)
  • [1] Formal Methods Assisted Training of Safe Reinforcement Learning Agents
    Murugesan, Anitha
    Moghadamfalahi, Mohammad
    Chattopadhyay, Arunabh
    NASA FORMAL METHODS (NFM 2019), 2019, 11460 : 333 - 340
  • [2] Safe Reinforcement Learning for CPSs via Formal Modeling and Verification
    Yang, Chenchen
    Liu, Jing
    Sun, Haiying
    Sun, Junfeng
    Chen, Xiang
    Zhang, Lipeng
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [3] On Normative Reinforcement Learning via Safe Reinforcement Learning
    Neufeld, Emery A.
    Bartocci, Ezio
    Ciabattoni, Agata
    PRIMA 2022: PRINCIPLES AND PRACTICE OF MULTI-AGENT SYSTEMS, 2023, 13753 : 72 - 89
  • [4] Safe HVAC Control via Batch Reinforcement Learning
    Liu, Hsin-Yu
    Balaji, Bharathan
    Gao, Sicun
    Gupta, Rajesh
    Hong, Dezhi
    2022 13TH ACM/IEEE INTERNATIONAL CONFERENCE ON CYBER-PHYSICAL SYSTEMS (ICCPS 2022), 2022, : 181 - 192
  • [5] Safe Reinforcement Learning via Shielding
    Alshiekh, Mohammed
    Bloem, Roderick
    Ehlers, Ruediger
    Koenighofer, Bettina
    Niekum, Scott
    Topcu, Ufuk
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 2669 - 2678
  • [6] Safe Learning in Robotics: From Learning-Based Control to Safe Reinforcement Learning
    Brunke, Lukas
    Greeff, Melissa
    Hall, Adam W.
    Yuan, Zhaocong
    Zhou, Siqi
    Panerati, Jacopo
    Schoellig, Angela P.
    ANNUAL REVIEW OF CONTROL ROBOTICS AND AUTONOMOUS SYSTEMS, 2022, 5 : 411 - 444
  • [7] Safe Building HVAC Control via Batch Reinforcement Learning
    Zhang, Chi
    Kuppannagari, Sanmukh Rao
    Prasanna, Viktor K.
    IEEE TRANSACTIONS ON SUSTAINABLE COMPUTING, 2022, 7 (04): : 923 - 934
  • [8] Safe Policies for Reinforcement Learning via Primal-Dual Methods
    Paternain, Santiago
    Calvo-Fullana, Miguel
    Chamon, Luiz F. O.
    Ribeiro, Alejandro
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2023, 68 (03) : 1321 - 1336
  • [9] Safe Reinforcement Learning via Curriculum Induction
    Turchetta, Matteo
    Kolobov, Andrey
    Shah, Shital
    Krause, Andreas
    Agarwal, Alekh
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [10] Shielded Reinforcement Learning: A review of reactive methods for safe learning
    Odriozola-Olalde, Haritz
    Zamalloa, Maider
    Arana-Arexolaleiba, Nestor
    2023 IEEE/SICE INTERNATIONAL SYMPOSIUM ON SYSTEM INTEGRATION, SII, 2023,