Certified policy synthesis for general Markov decision processes: An application in building automation systems

Cited by: 11
Authors
Haesaert, Sofie [1 ]
Cauchi, Nathalie [2 ]
Abate, Alessandro [2 ]
Affiliations
[1] Tech Univ Eindhoven, Dept Elect Engn, Eindhoven, Netherlands
[2] Univ Oxford, Dept Comp Sci, Wolfson Bldg,Parks Rd, Oxford, England
Keywords
Verification; Synthesis; General Markov decision processes; Safety; Building automation systems; Temperature control; MODEL-PREDICTIVE CONTROL; PROBABILITY-MEASURES; ENERGY MANAGEMENT; REDUCTION; EXISTENCE;
DOI
10.1016/j.peva.2017.09.005
Chinese Library Classification
TP3 [Computing technology; computer technology];
Discipline classification code
0812 ;
Abstract
In this paper, we present an industrial application of new approximate similarity relations for Markov models, and show that they are key for the synthesis of control strategies. Modern engineering systems are typically modelled by complex, high-order models that make correct-by-design controller construction computationally hard. The new approximate similarity relations reduce this complexity while providing certificates on the performance of the synthesised policies. The application deals with stochastic models of the thermal dynamics in a "smart building" set-up: such a building automation system can be described by discrete-time Markov decision processes evolving over an uncountable state space and endowed with an output quantifying the room temperature. The new similarity relations draw a quantitative connection between different levels of model abstraction, and allow control strategies synthesised on simpler models to be quantitatively refined over complex ones. The relations, underpinned by the use of metrics, in particular allow for a useful trade-off between deviations over probability distributions on states and distances between model outputs. We develop a software toolbox supporting the application and the computational implementation of these new relations. (C) 2017 Elsevier B.V. All rights reserved.
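The workflow the abstract describes, abstracting a continuous-state thermal model to a finite MDP and synthesising a policy that maximises the probability of staying in a comfort band, can be sketched as below. This is a minimal illustration, not the paper's model or its toolbox: the linear dynamics x' = a·x + b·u + c + noise, all coefficients, the comfort band, and the grid are hypothetical values chosen for the example, and the grid abstraction here carries no formal error certificate of the kind the similarity relations provide.

```python
import numpy as np
from math import erf, sqrt

# Hypothetical 1-D room-temperature model (assumed, for illustration):
#   x' = A*x + B*u + C + Gaussian noise,  u in {heater off, heater on}
A, B, C = 0.9, 0.5, 2.0
SIGMA = 0.25                                # process-noise std dev (assumed)
GRID = np.linspace(15.0, 25.0, 41)          # temperature grid, 0.25 C spacing
ACTIONS = [0.0, 1.0]                        # heater off / on
SAFE = (GRID >= 19.0) & (GRID <= 23.0)      # comfort band 19-23 C

def cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def transition_matrix(u):
    """Row-stochastic matrix: mass moved from grid cell i to cell j under input u."""
    h = (GRID[1] - GRID[0]) / 2.0
    P = np.zeros((len(GRID), len(GRID)))
    for i, x in enumerate(GRID):
        mean = A * x + B * u + C
        for j, y in enumerate(GRID):
            P[i, j] = cdf((y + h - mean) / SIGMA) - cdf((y - h - mean) / SIGMA)
        P[i] /= P[i].sum()                  # renormalise mass truncated off-grid
    return P

P = {u: transition_matrix(u) for u in ACTIONS}

# Backward recursion: maximal probability of remaining in the comfort
# band for N steps, with the greedy action recorded at each step.
N = 10
V = SAFE.astype(float)
policy = []
for _ in range(N):
    Q = np.stack([P[u] @ V for u in ACTIONS])   # one row of values per action
    policy.append(np.argmax(Q, axis=0))
    V = SAFE * Q.max(axis=0)

# V[i] is now the N-step safety probability starting from temperature GRID[i].
```

On the actual continuous-state model, the paper's approximate similarity relations would additionally bound how far this finite-abstraction value can deviate from the true one, which is what turns the computed policy into a certified one.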
Pages: 75 - 103
Page count: 29
Related papers
50 records in total
  • [21] Policy Gradient for Rectangular Robust Markov Decision Processes
    Kumar, Navdeep
    Derman, Esther
    Geist, Matthieu
    Levy, Kfir
    Mannor, Shie
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [22] Policy iteration for robust nonstationary Markov decision processes
    Sinha, Saumya
    Ghate, Archis
    OPTIMIZATION LETTERS, 2016, 10 : 1613 - 1628
  • [23] Policy iteration for robust nonstationary Markov decision processes
    Sinha, Saumya
    Ghate, Archis
    OPTIMIZATION LETTERS, 2016, 10 (08) : 1613 - 1628
  • [24] The Smoothed Complexity of Policy Iteration for Markov Decision Processes
    Christ, Miranda
    Yannakakis, Mihalis
    PROCEEDINGS OF THE 55TH ANNUAL ACM SYMPOSIUM ON THEORY OF COMPUTING, STOC 2023, 2023, : 1890 - 1903
  • [25] Oblivious Markov Decision Processes: Planning and Policy Execution
    Alsayegh, Murtadha
    Fuentes, Jose
    Bobadilla, Leonardo
    Shell, Dylan A.
    2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL, CDC, 2023, : 3850 - 3857
  • [26] Policy Iteration for Decentralized Control of Markov Decision Processes
    Bernstein, Daniel S.
    Amato, Christopher
    Hansen, Eric A.
    Zilberstein, Shlomo
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2009, 34 : 89 - 132
  • [27] Approximation of Markov decision processes with general state space
    Dufour, F.
    Prieto-Rumeau, T.
    JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 2012, 388 (02) : 1254 - 1267
  • [28] Synthesis for PCTL in Parametric Markov Decision Processes
    Hahn, Ernst Moritz
    Han, Tingting
    Zhang, Lijun
    NASA FORMAL METHODS, 2011, 6617 : 146 - +
  • [29] A policy gradient method for semi-Markov decision processes with application to call admission control
    Singh, Sumeetpal S.
    Tadic, Vladislav B.
    Doucet, Arnaud
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2007, 178 (03) : 808 - 818
  • [30] Building efficient partial plans using Markov decision processes
    Laroche, P
    12TH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2000, : 156 - 163