Constructing MDP Abstractions Using Data With Formal Guarantees

Cited by: 8
Authors
Lavaei, Abolfazl [1]
Soudjani, Sadegh [2]
Frazzoli, Emilio [1]
Zamani, Majid [3, 4]
Affiliations
[1] Swiss Fed Inst Technol, Inst Dynam Syst & Control, CH-8092 Zurich, Switzerland
[2] Newcastle Univ, Sch Comp, Newcastle Upon Tyne NE4 5TG, Tyne & Wear, England
[3] Univ Colorado, Comp Sci Dept, Boulder, CO 80309 USA
[4] Ludwig Maximilians Univ Munchen, Dept Comp Sci, D-80539 Munich, Germany
Source
Funding
Swiss National Science Foundation; UK Engineering and Physical Sciences Research Council
Keywords
Stochastic processes; Trajectory; Control systems; Stochastic systems; Probabilistic logic; Markov processes; Picture archiving and communication systems; Data-driven synthesis; MDP abstractions; stochastic bisimulation functions; formal guarantees; STOCHASTIC-SYSTEMS; VERIFICATION; SAFETY;
DOI
10.1109/LCSYS.2022.3188535
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This letter presents a data-driven technique for constructing finite Markov decision processes (MDPs) as finite abstractions of discrete-time stochastic control systems with unknown dynamics, while providing formal closeness guarantees. The proposed scheme is based on notions of stochastic bisimulation functions (SBFs), which capture the probabilistic distance between state trajectories of an unknown stochastic system and those of a finite MDP. In the proposed setting, we first reformulate the conditions defining an SBF as a robust convex program (RCP). We then propose a scenario convex program (SCP) associated with the original RCP by collecting a finite number of data samples from trajectories of the system. Finally, we construct an SBF between the data-driven finite MDP and the unknown stochastic system with a given confidence level by establishing a probabilistic relation between the optimal values of the SCP and the RCP. We also propose two different approaches for constructing finite MDPs from data. We illustrate the efficacy of our results on a nonlinear jet engine compressor with unknown dynamics: we construct a data-driven finite MDP as a suitable substitute for the original system and use it to synthesize controllers that keep the system in a safe set with some probability of satisfaction and a desired confidence level.
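The step from an RCP to an SCP described in the abstract can be illustrated with a toy sketch. It is not the letter's SBF conditions: the constraint function `g`, the sampler `uniform01`, and the helper `scenario_optimum` are all hypothetical names chosen for illustration. The toy robust program is "minimize eta subject to g(x) <= eta for every x in [0, 1]"; the scenario version enforces the constraint only at N sampled points, so its optimum is simply the largest sampled value of g and can never exceed the robust optimum.

```python
import random


def g(x):
    # Hypothetical toy constraint function; its supremum over [0, 1] is 0.25 at x = 0.5.
    return x * (1.0 - x)


def uniform01(rng):
    # Hypothetical sampler: draws one scenario uniformly from [0, 1).
    return rng.random()


def scenario_optimum(constraint, sampler, n_samples, seed=0):
    """Optimal value of the toy scenario program
    min eta  s.t.  constraint(x_i) <= eta  for i = 1..n_samples,
    which is the largest constraint value over the sampled scenarios."""
    rng = random.Random(seed)
    samples = [sampler(rng) for _ in range(n_samples)]
    return max(constraint(x) for x in samples)


# Robust optimum of the toy problem, known in closed form here (assumption:
# in a data-driven setting this value would be unknown).
eta_rcp = 0.25

eta_scp_small = scenario_optimum(g, uniform01, 100)
eta_scp_large = scenario_optimum(g, uniform01, 1000)

# The gap between the two optima is what a probabilistic bound has to control.
gap_small = eta_rcp - eta_scp_small
gap_large = eta_rcp - eta_scp_large
print(eta_scp_small, eta_scp_large, gap_small, gap_large)
```

With a fixed seed the 1000-sample run reuses the first 100 scenarios, so its optimum is at least as large as the 100-sample one. In the letter's setting the gap is not computed directly but bounded with a given confidence level; this sketch only shows why the scenario optimum underestimates the robust one.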
Pages: 460-465
Number of pages: 6