MONOTONE OPTIMAL POLICIES IN DISCOUNTED MARKOV DECISION PROCESSES WITH TRANSITION PROBABILITIES INDEPENDENT OF THE CURRENT STATE: EXISTENCE AND APPROXIMATION

Cited: 0
Author
Flores-Hernandez, Rosa M. [1 ]
Affiliation
[1] Univ Autonoma Tlaxcala, Fac Ciencias Basicas Ingn & Tecnol, Apizaco 90300, Tlaxcala, Mexico
Keywords
Markov decision process; total discounted cost; total discounted reward; increasing optimal policy; decreasing optimal policy; policy iteration algorithm; models; optimization
DOI
None available
CLC number
TP3 [Computing technology; computer technology]
Discipline classification code
0812
Abstract
This paper considers Markov decision processes (MDPs) with the total discounted cost as the objective function, whose state and decision spaces are subsets of the real line and not necessarily finite or denumerable. The cost function is possibly unbounded, the dynamics are independent of the current state, and the decision sets are possibly non-compact. In this setting, conditions that yield either an increasing or a decreasing optimal stationary policy are provided; these conditions do not require convexity assumptions. Versions of the policy iteration algorithm (PIA) that approximate increasing or decreasing optimal stationary policies are detailed, and an illustrative example is presented. Finally, comments are given on the monotonicity conditions and on the monotone versions of the PIA as applied to discounted MDPs with rewards.
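The generic policy iteration scheme the abstract refers to can be sketched for a finite discounted-cost MDP whose transition law depends only on the chosen action, mirroring the paper's assumption that the dynamics are independent of the current state. This is a minimal illustrative sketch under simplifying assumptions (finite spaces, toy data), not the paper's algorithm, which handles uncountable state and decision spaces; all names and the example data below are hypothetical.

```python
# Minimal policy iteration for a finite discounted-cost MDP in which the
# transition probabilities depend only on the action, not on the current
# state (the structural assumption of the paper). Illustrative sketch only.

def policy_iteration(costs, trans, beta=0.9, tol=1e-10):
    """costs[s][a]: one-stage cost; trans[a][s2]: P(next = s2 | action a)."""
    n_states = len(costs)
    n_actions = len(costs[0])
    policy = [0] * n_states
    while True:
        # Policy evaluation: iterate the fixed-policy Bellman operator,
        # a beta-contraction, until the value estimate stabilizes.
        V = [0.0] * n_states
        while True:
            V_new = [costs[s][policy[s]]
                     + beta * sum(trans[policy[s]][s2] * V[s2]
                                  for s2 in range(n_states))
                     for s in range(n_states)]
            done = max(abs(a - b) for a, b in zip(V, V_new)) < tol
            V = V_new
            if done:
                break
        # Policy improvement: one-step greedy (cost-minimizing) lookahead.
        new_policy = [min(range(n_actions),
                          key=lambda a: costs[s][a]
                          + beta * sum(trans[a][s2] * V[s2]
                                       for s2 in range(n_states)))
                      for s in range(n_states)]
        if new_policy == policy:
            return policy, V
        policy = new_policy

# Toy two-state, two-action example: each state s prefers action s, and the
# transition law is the same for both actions, so the optimal stationary
# policy is the increasing map s -> s.
policy, V = policy_iteration([[0.0, 1.0], [1.0, 0.0]],
                             [[0.5, 0.5], [0.5, 0.5]])
# policy == [0, 1], an increasing optimal stationary policy
```

Because transitions here do not depend on the current state, the expected continuation value is the same for every state once the action is fixed, which is exactly what makes the monotone structure of the greedy improvement step easy to see in this toy case.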
Pages: 705-719
Page count: 15