Absorbing Markov decision processes

Cited: 2
Authors
Dufour, Francois [1 ,2 ,3 ]
Prieto-Rumeau, Tomas [4 ]
Affiliations
[1] Univ Bordeaux, Inst Polytech Bordeaux, Bordeaux, France
[2] Univ Bordeaux, Team ASTRAL, INRIA Bordeaux Sud Ouest, Bordeaux, France
[3] Univ Bordeaux, Inst Math Bordeaux, Bordeaux, France
[4] UNED, Stat Dept, Madrid, Spain
Keywords
Markov decision processes; absorbing model; occupation measures; characteristic equation; phantom measures; compactness of the set of occupation measures; EQUILIBRIA; POLICIES;
DOI
10.1051/cocv/2024002
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In this paper, we study discrete-time absorbing Markov Decision Processes (MDPs) with measurable state space and Borel action space, for a given initial distribution. For such models, there may exist solutions to the characteristic equation that are not occupation measures. Several necessary and sufficient conditions are provided to guarantee that any solution to the characteristic equation is an occupation measure. Under the so-called continuity-compactness conditions, we first show that a measure is an occupation measure if and only if it satisfies the characteristic equation together with an additional absolute continuity condition. Second, we show that the set of occupation measures is compact in the weak-strong topology if and only if the model is uniformly absorbing. Several examples are provided to illustrate our results.
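As a purely illustrative sketch of the characteristic equation mentioned in the abstract, consider a toy finite absorbing MDP (a hypothetical two-action chain, not the paper's measurable-state setting). In the finite case, the occupation measure of a stationary policy counts expected visits to each state-action pair before absorption, and it solves the balance equation mu(y) = nu(y) + sum_{x,a} mu(x,a) P(y | x, a) over the non-absorbing states:

```python
import numpy as np

# Toy absorbing MDP (hypothetical numbers): transient states {0, 1},
# absorbing state 2, actions {0, 1}. P[a, x, y] = Prob(x -> y | action a).
P = np.array([
    [[0.5, 0.3, 0.2],   # action 0, rows indexed by current state
     [0.1, 0.4, 0.5],
     [0.0, 0.0, 1.0]],
    [[0.2, 0.2, 0.6],   # action 1
     [0.3, 0.1, 0.6],
     [0.0, 0.0, 1.0]],
])
pi = np.array([[0.7, 0.3],   # stationary policy pi(a | x) on transient states
               [0.4, 0.6]])
nu = np.array([0.8, 0.2])    # initial distribution on transient states

T = [0, 1]                   # transient (non-absorbing) states
P_T = P[:, T][:, :, T]       # dynamics restricted to transient states, (a, x, y)

# Policy-induced transition matrix on transient states.
P_pi = np.einsum('xa,axy->xy', pi, P_T)

# Expected visit counts m satisfy m = nu + m P_pi, i.e. m = nu (I - P_pi)^{-1}.
m = np.linalg.solve((np.eye(len(T)) - P_pi).T, nu)

# Occupation measure: mu(x, a) = m(x) * pi(a | x).
mu = m[:, None] * pi

# Characteristic equation on transient states:
#   sum_a mu(y, a) = nu(y) + sum_{x, a} mu(x, a) P(y | x, a).
lhs = mu.sum(axis=1)
rhs = nu + np.einsum('xa,axy->y', mu, P_T)
assert np.allclose(lhs, rhs)
```

In this finite setting every solution of the characteristic equation with the right total mass is an occupation measure; the paper's point is that on general measurable state spaces spurious solutions ("phantom measures") can arise, which is what the extra conditions rule out.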
Pages: 18