Safe Reinforcement Learning via Formal Methods: Toward Safe Control Through Proof and Learning

Cited by: 0
Authors
Fulton, Nathan [1 ]
Platzer, Andre [1 ]
Affiliations
[1] Carnegie Mellon Univ, Dept Comp Sci, Pittsburgh, PA 15213 USA
Funding
National Science Foundation (US);
Keywords
MARKOV DECISION-PROCESSES;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Formal verification provides a high degree of confidence in safe system operation, but only if reality matches the verified model. Although a good model will be accurate most of the time, even the best models are incomplete. This is especially true in Cyber-Physical Systems because high-fidelity physical models of systems are expensive to develop and often intractable to verify. Conversely, reinforcement learning-based controllers are lauded for their flexibility in unmodeled environments, but do not provide guarantees of safe operation. This paper presents an approach for provably safe learning that provides the best of both worlds: the exploration and optimization capabilities of learning along with the safety guarantees of formal verification. Our main insight is that formal verification combined with verified runtime monitoring can ensure the safety of a learning agent. Verification results are preserved whenever learning agents limit exploration within the confines of verified control choices, as long as observed reality comports with the model used for off-line verification. When a model violation is detected, the agent abandons efficiency and instead attempts to learn a control strategy that guides the agent to a modeled portion of the state space. We prove that our approach toward incorporating knowledge about safe control into learning systems preserves safety guarantees, and demonstrate that we retain the empirical performance benefits provided by reinforcement learning. We also explore various points in the design space for these justified speculative controllers in a simple model of adaptive cruise control for autonomous cars.
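The abstract's core loop can be illustrated with a minimal Python sketch. This is not the authors' implementation (which uses formally verified monitors synthesized from differential dynamic logic models); the names `model_monitor`, `controller_monitor`, and `choose_action` are hypothetical stand-ins. The idea: while the observed transition comports with the verified model, exploration is restricted to control choices the offline proof certifies as safe; on a detected model violation, the agent falls back to unrestricted learning aimed at recovery.

```python
def safe_actions(state, candidate_actions, controller_monitor):
    """Filter to actions the (assumed) verified controller monitor approves."""
    return [a for a in candidate_actions if controller_monitor(state, a)]

def choose_action(state, prev_state, prev_action, q_values, candidate_actions,
                  model_monitor, controller_monitor):
    # If the last observed transition comports with the verified model
    # (or there is no history yet), stay within proven-safe choices.
    if prev_state is None or model_monitor(prev_state, prev_action, state):
        allowed = safe_actions(state, candidate_actions, controller_monitor)
        if allowed:
            # Exploit learned Q-values, restricted to verified-safe actions.
            return max(allowed, key=lambda a: q_values.get((state, a), 0.0))
    # Model violation detected: abandon the safety restriction and learn a
    # recovery strategy that steers back toward the modeled state space.
    return max(candidate_actions, key=lambda a: q_values.get((state, a), 0.0))
```

In the real approach the two monitors are not hand-written predicates but runtime monitors whose correctness is itself formally proved, which is what lets the safety theorem transfer from the offline proof to the running learner.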
Pages: 6485-6492
Page count: 8
Related Papers (10 of 50 shown)
  • [1] Formal Methods Assisted Training of Safe Reinforcement Learning Agents
    Murugesan, Anitha
    Moghadamfalahi, Mohammad
    Chattopadhyay, Arunabh
    NASA FORMAL METHODS (NFM 2019), 2019, 11460 : 333 - 340
  • [2] Safe Reinforcement Learning for CPSs via Formal Modeling and Verification
    Yang, Chenchen
    Liu, Jing
    Sun, Haiying
    Sun, Junfeng
    Chen, Xiang
    Zhang, Lipeng
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [3] On Normative Reinforcement Learning via Safe Reinforcement Learning
    Neufeld, Emery A.
    Bartocci, Ezio
    Ciabattoni, Agata
    PRIMA 2022: PRINCIPLES AND PRACTICE OF MULTI-AGENT SYSTEMS, 2023, 13753 : 72 - 89
  • [4] Safe HVAC Control via Batch Reinforcement Learning
    Liu, Hsin-Yu
    Balaji, Bharathan
    Gao, Sicun
    Gupta, Rajesh
    Hong, Dezhi
    2022 13TH ACM/IEEE INTERNATIONAL CONFERENCE ON CYBER-PHYSICAL SYSTEMS (ICCPS 2022), 2022, : 181 - 192
  • [5] Safe Reinforcement Learning via Shielding
    Alshiekh, Mohammed
    Bloem, Roderick
    Ehlers, Ruediger
    Koenighofer, Bettina
    Niekum, Scott
    Topcu, Ufuk
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 2669 - 2678
  • [6] Safe Learning in Robotics: From Learning-Based Control to Safe Reinforcement Learning
    Brunke, Lukas
    Greeff, Melissa
    Hall, Adam W.
    Yuan, Zhaocong
    Zhou, Siqi
    Panerati, Jacopo
    Schoellig, Angela P.
    ANNUAL REVIEW OF CONTROL ROBOTICS AND AUTONOMOUS SYSTEMS, 2022, 5 : 411 - 444
  • [7] Safe Building HVAC Control via Batch Reinforcement Learning
    Zhang, Chi
    Kuppannagari, Sanmukh Rao
    Prasanna, Viktor K.
    IEEE TRANSACTIONS ON SUSTAINABLE COMPUTING, 2022, 7 (04): : 923 - 934
  • [8] Safe Policies for Reinforcement Learning via Primal-Dual Methods
    Paternain, Santiago
    Calvo-Fullana, Miguel
    Chamon, Luiz F. O.
    Ribeiro, Alejandro
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2023, 68 (03) : 1321 - 1336
  • [9] Safe Reinforcement Learning via Curriculum Induction
    Turchetta, Matteo
    Kolobov, Andrey
    Shah, Shital
    Krause, Andreas
    Agarwal, Alekh
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [10] Shielded Reinforcement Learning: A review of reactive methods for safe learning
    Odriozola-Olalde, Haritz
    Zamalloa, Maider
    Arana-Arexolaleiba, Nestor
    2023 IEEE/SICE INTERNATIONAL SYMPOSIUM ON SYSTEM INTEGRATION, SII, 2023,