Robust optimal policies for Markov decision processes with safety-threshold constraints

Cited by: 0
Authors
Dimitrova, Rayna [1 ,2 ]
Fu, Jie [3 ]
Topcu, Ufuk [4 ]
Affiliations
[1] MPI SWS, Kaiserslautern, Germany
[2] MPI SWS, Saarbrucken, Germany
[3] Worcester Polytech Inst, Dept Elect & Comp Engn, Worcester, MA 01609 USA
[4] Univ Texas Austin, Dept Aerosp Engn & Engn Mech, Austin, TX 78712 USA
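Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract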
We study the synthesis of robust optimal control policies for Markov decision processes with transition uncertainty (UMDPs) and subject to two types of constraints: (i) constraints on the worst-case, maximal total cost and (ii) safety threshold constraints that bound the worst-case probability of visiting a set of error states. For maximal total cost constraints, we propose a state-augmentation method and a two-step synthesis algorithm to generate deterministic, memoryless optimal policies given the reward to be maximized. For safety threshold constraints, we introduce a new cost function and provide an approximately optimal solution by a reduction to an uncertain Markov decision process under a maximal total cost constraint. The safety-threshold constraints require memory and randomization for optimality. We discuss the use and the limitations of the proposed solution.
Pages: 7081-7086
Number of pages: 6
Related papers
50 in total
  • [1] Markov Decision Processes with Threshold Based Piecewise Linear Optimal Policies
    Erseghe, Tomaso
    Zanella, Andrea
    Codemo, Claudio G.
    [J]. IEEE WIRELESS COMMUNICATIONS LETTERS, 2013, 2 (04) : 459 - 462
  • [2] Online Reinforcement Learning of Optimal Threshold Policies for Markov Decision Processes
    Roy, Arghyadip
    Borkar, Vivek
    Karandikar, Abhay
    Chaporkar, Prasanna
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2022, 67 (07) : 3722 - 3729
  • [3] Optimal Decision Tree Policies for Markov Decision Processes
    Vos, Daniel
    Verwer, Sicco
    [J]. PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023 : 5457 - 5465
  • [4] Optimal Policies for Quantum Markov Decision Processes
    Ying, Ming-Sheng
    Feng, Yuan
    Ying, Sheng-Gang
    [J]. INTERNATIONAL JOURNAL OF AUTOMATION AND COMPUTING, 2021, 18 (03) : 410 - 421
  • [5] IDENTIFICATION OF OPTIMAL POLICIES IN MARKOV DECISION PROCESSES
    Sladky, Karel
    [J]. KYBERNETIKA, 2010, 46 (03) : 558 - 570
  • [6] Optimal adaptive policies for Markov decision processes
    Burnetas, AN
    Katehakis, MN
    [J]. MATHEMATICS OF OPERATIONS RESEARCH, 1997, 22 (01) : 222 - 255
  • [7] Optimal Policies for Quantum Markov Decision Processes
    Ming-Sheng Ying
    Yuan Feng
    Sheng-Gang Ying
    [J]. Machine Intelligence Research, 2021, 18 (03) : 410 - 421
  • [8] Optimal Policies for Quantum Markov Decision Processes
    Ming-Sheng Ying
    Yuan Feng
    Sheng-Gang Ying
    [J]. International Journal of Automation and Computing, 2021, 18 : 410 - 421
  • [9] MONOTONE OPTIMAL POLICIES FOR MARKOV DECISION-PROCESSES
    SERFOZO, RF
    [J]. MATHEMATICAL PROGRAMMING STUDY, 1976, 6 (DEC) : 202 - 215
  • [10] Conditions for the uniqueness of optimal policies of discounted Markov decision processes
    Daniel Cruz-Suárez
    Raúl Montes-de-Oca
    Francisco Salem-Silva
    [J]. Mathematical Methods of Operations Research, 2004, 60 : 415 - 436