Uniqueness and stability of optimal policies of finite state Markov decision processes

Cited by: 2

Authors:

Leizarowitz, Arie [1]

Zaslavski, Alexander J. [1]

Affiliation:

[1] Technion Israel Inst Technol, Dept Math, IL-32000 Haifa, Israel

Source:

MATHEMATICS OF OPERATIONS RESEARCH | 2007 / Vol. 32 / No. 1

Keywords:

Markov decision process; single optimality equation; stability of Markov policies; genericity; overtaking optimality

DOI:

10.1287/moor.1060.0232

Chinese Library Classification (CLC):

C93 [Management Science]; O22 [Operations Research]

Discipline classification codes:

070105 ; 12 ; 1201 ; 1202 ; 120202 ;

Abstract:

In this paper we consider infinite horizon discrete-time optimal control of Markov decision processes (MDPs) with finite state spaces and compact action sets. We restrict attention to unicost MDPs, which form a class that contains all the weakly communicating MDPs. The unicost MDPs are characterized as those MDPs for which there exists a solution to the single optimality equation. We address the problem of uniqueness and stability of minimizing Markov actions. Our main result asserts that when we endow the set of unicost MDPs with a certain natural metric, under which it is complete, then the class of MDPs with essentially unique and stable minimizing Markov actions contains the intersection of countably many open dense sets (hence is itself dense). Thus, the property of having essentially unique and stable minimizing Markov actions is generic for unicost MDPs.
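
The single optimality equation named in the abstract is not written out in this record. In the standard average-cost setting it is usually stated as below. The notation is assumed for illustration and is not taken from the paper: S is the finite state space, A(i) the compact action set at state i, c the one-step cost, p the transition probabilities, λ a constant, and h a vector indexed by the states.

```latex
\lambda + h(i) \;=\; \min_{a \in A(i)} \Bigl[\, c(i,a) + \sum_{j \in S} p(j \mid i, a)\, h(j) \Bigr],
\qquad i \in S .
```

In this reading, λ plays the role of the optimal long-run average cost, which is the same for every initial state, and a minimizing Markov action at state i is an action attaining the minimum on the right-hand side.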

Pages: 156-167

Number of pages: 12
