Risk-Constrained Markov Decision Processes

Cited by: 36
Authors
Borkar, Vivek [1]
Jain, Rahul [2, 3]
Affiliations
[1] Indian Inst Technol, Dept Elect Engn, Bombay 400076, Maharashtra, India
[2] Univ So Calif, Dept Elect Engn, Los Angeles, CA 90089 USA
[3] Univ So Calif, Dept Ind & Syst Engn, Los Angeles, CA 90089 USA
Funding
National Science Foundation (USA);
Keywords
Constrained Markov decision processes; risk measures; stochastic approximations; OPTIMIZATION;
DOI
10.1109/TAC.2014.2309262
Chinese Library Classification number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
We propose a new constrained Markov decision process framework with risk-type constraints. The risk metric we use is Conditional Value-at-Risk (CVaR), which is gaining popularity in finance. It is a conditional expectation, but the conditioning is defined in terms of the level of the tail probability. We propose an iterative offline algorithm to find the risk-constrained optimal control policy. A two time-scale stochastic approximation-inspired 'learning' variant is also sketched, and its convergence to the optimal risk-constrained policy is proved.
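To make the risk metric concrete, the following is a minimal sketch of an empirical CVaR estimate from a finite sample of costs. It is not the algorithm from the paper: the function name `empirical_var_cvar`, the parameter `tail_prob`, and the convention of averaging the worst `tail_prob` fraction of samples are assumptions chosen for illustration.

```python
import math


def empirical_var_cvar(costs, tail_prob):
    """Estimate (VaR, CVaR) of a cost sample at a given tail probability.

    Assumed convention (illustrative only): larger cost is worse, VaR is the
    smallest cost inside the worst `tail_prob` fraction of the sample, and
    CVaR is the average cost over that tail.
    """
    if not 0.0 < tail_prob <= 1.0:
        raise ValueError("tail_prob must lie in (0, 1]")
    ordered = sorted(costs)
    n = len(ordered)
    if n == 0:
        raise ValueError("costs must be non-empty")
    # Number of samples in the tail; the small offset guards against
    # floating-point round-up when tail_prob * n is an exact integer.
    k = max(1, math.ceil(tail_prob * n - 1e-12))
    tail = ordered[n - k:]
    var = tail[0]          # threshold that defines the tail
    cvar = sum(tail) / k   # conditional mean of costs in the tail
    return var, cvar


# Example: ten equally likely costs 1..10 with a 20% tail.
costs = [float(c) for c in range(1, 11)]
var, cvar = empirical_var_cvar(costs, tail_prob=0.2)
print(var, cvar)
```

In this toy sample the tail holds the two largest costs, so the CVaR is their mean and is never smaller than the VaR threshold. That ordering is why a CVaR constraint is stricter than a quantile constraint at the same level.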
Pages: 2574-2579
Page count: 6
Related papers
50 in total
  • [1] Risk-constrained Markov Decision Processes
    Borkar, Vivek
    Jain, Rahul
    [J]. 49TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2010, : 2664 - 2669
  • [2] Reinforcement Learning of Risk-Constrained Policies in Markov Decision Processes
    Brazdil, Tomas
    Chatterjee, Krishnendu
    Novotny, Petr
    Vahala, Jiri
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 9794 - 9801
  • [3] Constrained Risk-Averse Markov Decision Processes
    Ahmadi, Mohamadreza
    Rosolia, Ugo
    Ingham, Michel D.
    Murray, Richard M.
    Ames, Aaron D.
    [J]. THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 11718 - 11725
  • [4] On constrained Markov decision processes
    Haviv, M
    [J]. OPERATIONS RESEARCH LETTERS, 1996, 19 (01) : 25 - 28
  • [5] Learning in Constrained Markov Decision Processes
    Singh, Rahul
    Gupta, Abhishek
    Shroff, Ness B.
    [J]. IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2023, 10 (01) : 441 - 453
  • [6] Approximate solutions to constrained risk-sensitive Markov decision processes
    Kumar, Uday M.
    Bhat, Sanjay P.
    Kavitha, Veeraruna
    Hemachandra, Nandyala
    [J]. EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2023, 310 (01) : 249 - 267
  • [7] Dynamic programming in constrained Markov decision processes
    Piunovskiy, A. B.
    [J]. CONTROL AND CYBERNETICS, 2006, 35 (03) : 645 - 660
  • [8] Robustness of policies in constrained Markov decision processes
    Zadorojniy, A
    Shwartz, A
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2006, 51 (04) : 635 - 638
  • [9] Reinforcement Learning for Constrained Markov Decision Processes
    Gattami, Ather
    Bai, Qinbo
    Aggarwal, Vaneet
    [J]. 24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
  • [10] Markov decision processes with constrained stopping times
    Horiguchi, M
    Kurano, M
    Yasuda, M
    [J]. PROCEEDINGS OF THE 39TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-5, 2000, : 706 - 710