Learning and optimal control of imprecise Markov decision processes by dynamic programming using the imprecise Dirichlet model

Cited by: 0
Authors
Troffaes, MCM [1]
Affiliations
[1] State Univ Ghent, SYSTeMS Res Grp, B-9052 Zwijnaarde, Belgium
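Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405;
Abstract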
In this paper, we investigate the conditions under which dynamic programming yields a solution to simultaneous learning and optimal control of a Markov decision process. First, we introduce a new optimality criterion that allows act-state dependence. This criterion is based on a partial preference ordering induced by an imprecise probability model of the dynamics of the system, updated by observations of the state and control history of the system. Then, we show that dynamic programming yields the set of all optimal solutions if the imprecise probability model satisfies particular properties. When we model learning of the system dynamics by an imprecise Dirichlet model, these properties turn out to be satisfied.
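As a rough illustration of the learning model named in the abstract, the sketch below computes the standard imprecise Dirichlet model bounds on a transition probability from observed counts: lower bound n / (N + s) and upper bound (n + s) / (N + s), where n is the count for one next state, N is the total count, and s > 0 is the model's hyperparameter. The function name `idm_bounds`, the example counts, and the choice s = 2 are assumptions made for illustration and are not taken from the paper. The sketch does not implement the paper's optimality criterion or its dynamic programming procedure.

```python
def idm_bounds(counts, s=2.0):
    """Return {next_state: (lower, upper)} probability bounds from transition counts.

    Uses the usual imprecise Dirichlet model formulas; `s` is the
    hyperparameter controlling how slowly the bounds tighten (assumed value).
    """
    if s <= 0:
        raise ValueError("s must be positive")
    total = sum(counts.values())
    return {
        state: (n / (total + s), (n + s) / (total + s))
        for state, n in counts.items()
    }


# Hypothetical counts of next states observed after one (state, control) pair.
observed = {"a": 3, "b": 1}
bounds = idm_bounds(observed, s=2.0)
for state, (lower, upper) in bounds.items():
    print(f"{state}: [{lower:.3f}, {upper:.3f}]")
```

With no observations at all, every interval is [0, 1], which reflects prior ignorance; the intervals narrow as counts accumulate.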
Pages: 141-148
Number of pages: 8
Related papers
50 in total
  • [21] Near-optimal control for a stochastic SIRS model with imprecise parameters
    Mu, Xiaojie
    Zhang, Qimin
    Rong, Libin
    [J]. ASIAN JOURNAL OF CONTROL, 2020, 22 (05) : 2090 - 2105
  • [22] Limits of Learning from Imperfect Observations under Prior Ignorance: the Case of the Imprecise Dirichlet Model
    Piatti, Alberto
    Zaffalon, Marco
    Trojani, Fabio
    [J]. ISIPTA 05-PROCEEDINGS OF THE FOURTH INTERNATIONAL SYMPOSIUM ON IMPRECISE PROBABILITIES AND THEIR APPLICATIONS, 2005, : 276 - 286
  • [24] Imprecise parameters for near-optimal control of stochastic SIV epidemic model
    Wang, Zong
    Zhang, Qimin
    Meyer-Baese, Anke
    [J]. MATHEMATICAL METHODS IN THE APPLIED SCIENCES, 2020, 43 (05) : 2301 - 2321
  • [25] Theoretical Analysis of an Imprecise Prey-Predator Model with Harvesting and Optimal Control
    Das, Anjana
    Pal, M.
    [J]. JOURNAL OF OPTIMIZATION, 2019, 2019
  • [26] Towards Dynamic Pricing for Shared Mobility on Demand using Markov Decision Processes and Dynamic Programming
    Guan, Yue
    Annaswamy, Anuradha M.
    Tseng, H. Eric
    [J]. 2020 IEEE 23RD INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS (ITSC), 2020,
  • [27] Risk-averse dynamic programming for Markov decision processes
    Ruszczynski, Andrzej
    [J]. MATHEMATICAL PROGRAMMING, 2010, 125 (02) : 235 - 261
  • [28] Risk-averse dynamic programming for Markov decision processes
    Andrzej Ruszczyński
    [J]. Mathematical Programming, 2010, 125 : 235 - 261
  • [29] Monotone optimal control for a class of Markov decision processes
    Zhuang, Weifen
    Li, Michael Z. F.
    [J]. EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2012, 217 (02) : 342 - 350
  • [30] Optimal control in light traffic Markov decision processes
    Ger Koole
    Olaf Passchier
    [J]. Mathematical Methods of Operations Research, 1997, 45 : 63 - 79