Distributional Reachability for Markov Decision Processes: Theory and Applications

Authors
Gao, Yulong [1]
Abate, Alessandro [2]
Xie, Lihua [3]
Johansson, Karl Henrik [4]
Affiliations
[1] Imperial Coll London, Dept Elect & Elect Engn, London SW7 2AZ, England
[2] Univ Oxford, Dept Comp Sci, Oxford OX1 3QD, England
[3] Nanyang Technol Univ, Sch Elect & Elect Engn, Nanyang 639798, Singapore
[4] KTH Royal Inst Technol & Digital Futures, Div Decis & Control Syst, Stockholm, Sweden
Funding
Swedish Research Council
Keywords
Distributional reachability; Markov decision processes (MDPs); probabilistic reachability; reach-avoid problems; set invariance; ALGORITHM;
DOI
10.1109/TAC.2023.3341282
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We study distributional reachability for finite Markov decision processes (MDPs) from a control-theoretic perspective. Unlike standard probabilistic reachability notions, which are defined over MDP states or trajectories, in this article reachability is formulated over the space of probability distributions. We propose two set-valued maps for the forward and backward distributional reachability problems: the forward map collects all state distributions that can be reached from a set of initial distributions, while the backward map collects all state distributions that can reach a set of final distributions. We show that there exists a maximal invariant set under the forward map and that this set is the region to which the state distributions eventually always belong, regardless of the initial state distribution and policy. The backward map provides an alternative way to solve a class of important problems for MDPs: the study of controlled invariance, the characterization of the domain of attraction, and reach-avoid problems. Three case studies illustrate the effectiveness of our approach.
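The forward map described in the abstract can be illustrated with a minimal sketch. It is not the authors' construction: it assumes a finite MDP with transition tensor `P[a][s][t]`, a single initial distribution `pi`, and one step under a randomized Markov policy, so that the next distribution is pi'(t) = sum_s pi(s) sum_a policy[s][a] P[a][s][t]. Because this update is affine in each row of the policy, the one-step reachable set from `pi` is the convex hull of the images under deterministic policies, which the code enumerates. The names `step` and `forward_vertices` and the two-state MDP are hypothetical.

```python
from itertools import product


def step(pi, P, policy):
    """One-step update of a state distribution under a randomized policy.

    pi[s]        : probability of state s
    P[a][s][t]   : probability of moving from s to t under action a
    policy[s][a] : probability of choosing action a in state s
    """
    n = len(pi)
    nxt = [0.0] * n
    for s in range(n):
        for a, weight in enumerate(policy[s]):
            for t in range(n):
                nxt[t] += pi[s] * weight * P[a][s][t]
    return nxt


def forward_vertices(pi, P):
    """Images of pi under every deterministic policy.

    Under the assumptions stated above, the convex hull of these points
    is the set of distributions reachable from pi in one step.
    """
    n, m = len(pi), len(P)
    images = []
    for choice in product(range(m), repeat=n):
        policy = [[1.0 if a == choice[s] else 0.0 for a in range(m)]
                  for s in range(n)]
        images.append(step(pi, P, policy))
    return images


# Hypothetical 2-state, 2-action MDP (illustrative numbers only).
P = [
    [[0.9, 0.1], [0.2, 0.8]],  # action 0
    [[0.5, 0.5], [0.6, 0.4]],  # action 1
]
pi0 = [0.5, 0.5]
vertices = forward_vertices(pi0, P)
uniform = step(pi0, P, [[0.5, 0.5], [0.5, 0.5]])
```

Here `uniform`, the image of `pi0` under the uniformly randomized policy, equals the average of the four entries of `vertices`, which is consistent with the convex-hull description. Enumerating deterministic policies grows exponentially with the number of states, so this sketch is only meant for very small examples.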
Pages: 4598-4613
Page count: 16