Uniqueness and stability of optimal policies of finite state Markov decision processes

Cited by: 2
Authors
Leizarowitz, Arie [1 ]
Zaslavski, Alexander J. [1 ]
Affiliations
[1] Technion Israel Inst Technol, Dept Math, IL-32000 Haifa, Israel
Keywords
Markov decision process; single optimality equation; stability of Markov policies; genericity; overtaking optimality;
DOI
10.1287/moor.1060.0232
Chinese Library Classification (CLC) number
C93 [Management Science]; O22 [Operations Research];
Discipline classification code
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
In this paper we consider infinite horizon discrete-time optimal control of Markov decision processes (MDPs) with finite state spaces and compact action sets. We restrict attention to unicost MDPs, which form a class that contains all the weakly communicating MDPs. The unicost MDPs are characterized as those MDPs for which there exists a solution to the single optimality equation. We address the problem of uniqueness and stability of minimizing Markov actions. Our main result asserts that when we endow the set of unicost MDPs with a certain natural metric, under which it is complete, then the class of MDPs with essentially unique and stable minimizing Markov actions contains the intersection of countably many open dense sets (hence is itself dense). Thus, the property of having essentially unique and stable minimizing Markov actions is generic for unicost MDPs.
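For orientation, the single optimality equation mentioned in the abstract is commonly written in the average-cost form sketched below. The notation is an assumption made here for illustration and is not taken from the paper: S is the finite state space, A(i) the compact action set at state i, c(i,a) the one-step cost, p(j | i,a) the transition probability, λ a scalar (the optimal average cost), and h a relative value function.

```latex
\lambda + h(i) \;=\; \min_{a \in A(i)} \Big[\, c(i,a) + \sum_{j \in S} p(j \mid i,a)\, h(j) \Big],
\qquad i \in S .
```

In this reading, a minimizing Markov action at state i is any a in A(i) that attains the minimum on the right-hand side, and an MDP is unicost when some pair (λ, h) solves the equation with the same λ for every state.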
Pages: 156 - 167
Page count: 12
Related papers
50 in total
  • [1] Conditions for the uniqueness of optimal policies of discounted Markov decision processes
    Cruz-Suárez, D
    Montes-de-Oca, R
    Salem-Silva, F
    [J]. MATHEMATICAL METHODS OF OPERATIONS RESEARCH, 2004, 60 (03) : 415 - 436
  • [2] Conditions for the uniqueness of optimal policies of discounted Markov decision processes
    Daniel Cruz-Suárez
    Raúl Montes-de-Oca
    Francisco Salem-Silva
    [J]. Mathematical Methods of Operations Research, 2004, 60 : 415 - 436
  • [3] Nonuniqueness versus Uniqueness of Optimal Policies in Convex Discounted Markov Decision Processes
    Montes-de-Oca, Raul
    Lemus-Rodriguez, Enrique
    Sergio Salem-Silva, Francisco
    [J]. JOURNAL OF APPLIED MATHEMATICS, 2013,
  • [4] Optimal Decision Tree Policies for Markov Decision Processes
    Vos, Daniel
    Verwer, Sicco
    [J]. PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 5457 - 5465
  • [5] Optimal adaptive policies for Markov decision processes
    Burnetas, AN
    Katehakis, MN
    [J]. MATHEMATICS OF OPERATIONS RESEARCH, 1997, 22 (01) : 222 - 255
  • [6] Optimal Policies for Quantum Markov Decision Processes
    Ming-Sheng Ying
    Yuan Feng
    Sheng-Gang Ying
    [J]. International Journal of Automation and Computing, 2021, 18 : 410 - 421
  • [7] Optimal Policies for Quantum Markov Decision Processes
    Ying, Ming-Sheng
    Feng, Yuan
    Ying, Sheng-Gang
    [J]. INTERNATIONAL JOURNAL OF AUTOMATION AND COMPUTING, 2021, 18 (03) : 410 - 421
  • [8] IDENTIFICATION OF OPTIMAL POLICIES IN MARKOV DECISION PROCESSES
    Sladky, Karel
    [J]. KYBERNETIKA, 2010, 46 (03) : 558 - 570
  • [9] Optimal Policies for Quantum Markov Decision Processes
    Ming-Sheng Ying
    Yuan Feng
    Sheng-Gang Ying
    [J]. Machine Intelligence Research, 2021, 18 (03) : 410 - 421
  • [10] MONOTONE OPTIMAL POLICIES FOR MARKOV DECISION-PROCESSES
    SERFOZO, RF
    [J]. MATHEMATICAL PROGRAMMING STUDY, 1976, 6 (DEC) : 202 - 215