Finite horizon partially observable semi-Markov decision processes under risk probability criteria

Cited by: 0
Authors
Wen, Xin [1]
Guo, Xianping [2, 3]
Xia, Li [1, 3]
Affiliations
[1] School of Business, Sun Yat-sen University, Guangzhou, China
[2] School of Mathematics, Sun Yat-sen University, Guangzhou, China
[3] Guangdong Province Key Laboratory of Computational Science, Sun Yat-sen University, Guangzhou, China
Funding
National Natural Science Foundation of China
DOI
10.1016/j.orl.2024.107187
Abstract
This paper studies a risk probability minimization problem for finite horizon partially observable semi-Markov decision processes, which are among the most general models for stochastic dynamic systems. In contrast to the expected discounted and average criteria, the optimality criterion investigated here is to minimize the probability that the accumulated rewards do not reach a prescribed profit level at the finite terminal stage. First, the state space is augmented as the joint conditional distribution of the current unobserved state and the remaining profit goal. We introduce a class of policies depending on observable histories and a class of Markov policies that depend on the observable process together with the joint conditional distribution. Then, under mild assumptions, we prove by iteration techniques that the value function is the unique solution to the optimality equation for the probability criterion. The existence of an (ϵ-)optimal Markov policy for this problem is established. Finally, we use a bandit problem with the probability criterion to demonstrate our main results, for which an effective algorithm and the corresponding numerical calculation are given for the semi-Markov model. Moreover, for the case of reduction to the discrete-time Markov model, we derive a concise solution. © 2024 Elsevier B.V.
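The risk probability criterion described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a discrete-time two-armed Bernoulli bandit rather than the semi-Markov model, unit rewards, and a two-point prior on the unknown arm. All names and numbers (`P_HI`, `P_LO`, `Q_KNOWN`, `PRIOR_HI`, `belief`, `risk`) are hypothetical. The state is augmented with the belief about the unknown arm (summarized by success and failure counts) and the remaining profit goal, and backward recursion minimizes the probability of missing the goal.

```python
from functools import lru_cache

# Hypothetical parameters, chosen only for illustration.
P_HI, P_LO = 0.8, 0.3   # possible success rates of the unknown arm
Q_KNOWN = 0.5           # success rate of the fully known arm
PRIOR_HI = 0.5          # prior probability that the unknown arm is the good one


def belief(successes, failures):
    """Posterior probability that the unknown arm has success rate P_HI."""
    w_hi = PRIOR_HI * P_HI**successes * (1 - P_HI)**failures
    w_lo = (1 - PRIOR_HI) * P_LO**successes * (1 - P_LO)**failures
    return w_hi / (w_hi + w_lo)


@lru_cache(maxsize=None)
def risk(steps_left, goal_left, successes=0, failures=0):
    """Minimal probability of collecting less than goal_left reward
    in the remaining steps_left pulls (each success pays reward 1)."""
    if goal_left <= 0:
        return 0.0          # profit goal already reached
    if steps_left == 0:
        return 1.0          # horizon ended with the goal unmet

    # Pull the known arm: the belief about the unknown arm is unchanged.
    known = (Q_KNOWN * risk(steps_left - 1, goal_left - 1, successes, failures)
             + (1 - Q_KNOWN) * risk(steps_left - 1, goal_left, successes, failures))

    # Pull the unknown arm: the outcome updates the belief via the counts.
    b = belief(successes, failures)
    p = b * P_HI + (1 - b) * P_LO   # predictive success probability
    unknown = (p * risk(steps_left - 1, goal_left - 1, successes + 1, failures)
               + (1 - p) * risk(steps_left - 1, goal_left, successes, failures + 1))

    return min(known, unknown)


if __name__ == "__main__":
    # Minimal risk of earning fewer than 3 units of reward in 5 pulls.
    print(risk(5, 3))
```

Because the belief in this toy model depends only on the success and failure counts, memoizing on those counts stands in for the joint conditional distribution; in the general partially observable semi-Markov setting the belief is a distribution and sojourn times also enter the recursion.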
Related papers
50 in total
  • [1] The optimal probability of the risk for finite horizon partially observable Markov decision processes
    Wen, Xian
    Huo, Haifeng
    Cui, Jinhua
    AIMS MATHEMATICS, 2023, 8 (12) : 28435 - 28449
  • [2] Minimum risk probability for finite horizon semi-Markov decision processes
    Huang, Yonghui
    Guo, Xianping
    Li, Zhongfei
    JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 2013, 402 (01) : 378 - 391
  • [3] PARTIALLY OBSERVABLE SEMI-MARKOV REWARD PROCESSES
    MASUDA, Y
    JOURNAL OF APPLIED PROBABILITY, 1993, 30 (03) : 548 - 560
  • [4] Finite horizon semi-Markov decision processes with multiple constraints
    Huang, Yonghui
    2014 11TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION (WCICA), 2014 : 1761 - 1768
  • [5] A RISK MINIMIZATION PROBLEM FOR FINITE HORIZON SEMI-MARKOV DECISION PROCESSES WITH LOSS RATES
    Liu, Qiuli
    Zou, Xiaolong
    JOURNAL OF DYNAMICS AND GAMES, 2018, 5 (02) : 143 - 163
  • [6] THE EXPONENTIAL COST OPTIMALITY FOR FINITE HORIZON SEMI-MARKOV DECISION PROCESSES
    Huo, Haifeng
    Wen, Xian
    KYBERNETIKA, 2022, 58 (03) : 301 - 319
  • [7] Non-Stationary Semi-Markov Decision Processes on a Finite Horizon
    Ghosh, Mrinal K.
    Saha, Subhamay
    STOCHASTIC ANALYSIS AND APPLICATIONS, 2013, 31 (01) : 183 - 190
  • [8] Finite horizon semi-Markov decision processes with application to maintenance systems
    Huang, Yonghui
    Guo, Xianping
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2011, 212 (01) : 131 - 140
  • [9] Continuous-Observation Partially Observable Semi-Markov Decision Processes for Machine Maintenance
    Zhang, Mimi
    Revie, Matthew
    IEEE TRANSACTIONS ON RELIABILITY, 2017, 66 (01) : 202 - 218
  • [10] Zero-Sum Games for Finite-Horizon Semi-Markov Processes Under the Probability Criterion
    Huang, Xiangxiang
    Guo, Xianping
    Wen, Xin
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2023, 68 (09) : 5560 - 5567