MONOTONE OPTIMAL POLICIES IN DISCOUNTED MARKOV DECISION PROCESSES WITH TRANSITION PROBABILITIES INDEPENDENT OF THE CURRENT STATE: EXISTENCE AND APPROXIMATION

Cited by: 0
Authors
Flores-Hernandez, Rosa M. [1]
Affiliations
[1] Univ Autonoma Tlaxcala, Fac Ciencias Basicas Ingn & Tecnol, Apizaco 90300, Tlaxcala, Mexico
Keywords
Markov decision process; total discounted cost; total discounted reward; increasing optimal policy; decreasing optimal policy; policy iteration algorithm; MODELS; OPTIMIZATION
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper considers Markov decision processes (MDPs) with the discounted cost as the objective function and with state and decision spaces that are subsets of the real line, not necessarily finite or denumerable. The cost function may be unbounded, the dynamics are independent of the current state, and the decision sets may be non-compact. In this setting, conditions are provided under which either an increasing or a decreasing optimal stationary policy exists; these conditions require no convexity assumptions. Versions of the policy iteration algorithm (PIA) for approximating increasing or decreasing optimal stationary policies are detailed, and an illustrative example is presented. Finally, comments are given on the monotonicity conditions and the monotone versions of the PIA as applied to discounted MDPs with rewards.
Pages: 705-719
Number of pages: 15
Related papers
50 in total
  • [1] Value Iteration and Action ε-Approximation of Optimal Policies in Discounted Markov Decision Processes
    Montes-De-Oca, Raul
    Lemus-Rodriguez, Enrique
    [J]. RECENT ADVANCES IN APPLIED MATHEMATICS, 2009, : 213 - +
  • [2] MONOTONE OPTIMAL POLICIES FOR MARKOV DECISION-PROCESSES
    SERFOZO, RF
    [J]. MATHEMATICAL PROGRAMMING STUDY, 1976, 6 (DEC): : 202 - 215
  • [3] Conditions for the uniqueness of optimal policies of discounted Markov decision processes
    Daniel Cruz-Suárez
    Raúl Montes-de-Oca
    Francisco Salem-Silva
    [J]. Mathematical Methods of Operations Research, 2004, 60 : 415 - 436
  • [4] Conditions for the uniqueness of optimal policies of discounted Markov decision processes
    Cruz-Suárez, D
    Montes-de-Oca, R
    Salem-Silva, F
    [J]. MATHEMATICAL METHODS OF OPERATIONS RESEARCH, 2004, 60 (03) : 415 - 436
  • [5] EXISTENCE OF OPTIMAL STATIONARY POLICIES IN DISCOUNTED MARKOV DECISION-PROCESSES - APPROACHES BY OCCUPATION MEASURES
    KURANO, M
    KAWAI, M
    [J]. COMPUTERS & MATHEMATICS WITH APPLICATIONS, 1994, 27 (9-10) : 95 - 101
  • [6] Robust analysis of discounted Markov decision processes with uncertain transition probabilities
    LOU Zhen-kai
    HOU Fu-jun
    LOU Xu-ming
    [J]. Applied Mathematics:A Journal of Chinese Universities, 2020, 35 (04) : 417 - 436
  • [7] Robust analysis of discounted Markov decision processes with uncertain transition probabilities
    Zhen-kai Lou
    Fu-jun Hou
    Xu-ming Lou
    [J]. Applied Mathematics-A Journal of Chinese Universities, 2020, 35 : 417 - 436
  • [8] Robust analysis of discounted Markov decision processes with uncertain transition probabilities
    Lou Zhen-kai
    Hou Fu-jun
    Lou Xu-ming
    [J]. APPLIED MATHEMATICS-A JOURNAL OF CHINESE UNIVERSITIES SERIES B, 2020, 35 (04) : 417 - 436
  • [9] NOTE ON MONOTONE OPTIMAL POLICIES FOR MARKOV DECISION-PROCESSES
    KALIN, D
    [J]. MATHEMATICAL PROGRAMMING, 1978, 15 (02) : 220 - 222
  • [10] Nonuniqueness versus Uniqueness of Optimal Policies in Convex Discounted Markov Decision Processes
    Montes-de-Oca, Raul
    Lemus-Rodriguez, Enrique
    Sergio Salem-Silva, Francisco
    [J]. JOURNAL OF APPLIED MATHEMATICS, 2013,