Robust Risk-Sensitive Reinforcement Learning with Conditional Value-at-Risk

Cited by: 0
Authors
Ni, Xinyi [1]
Lai, Lifeng [1]
Affiliations
[1] Univ Calif Davis, Elect & Comp Engn, Davis, CA USA
Source
2024 IEEE INFORMATION THEORY WORKSHOP, ITW 2024 | 2024
Funding
National Science Foundation (USA);
Keywords
ambiguity sets; RMDP; risk-sensitive RL; CVaR; OPTIMIZATION;
DOI
10.1109/ITW61385.2024.10806953
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Robust Markov Decision Processes (RMDPs) have received significant research interest as an alternative to standard Markov Decision Processes (MDPs), which often assume fixed transition probabilities. RMDPs relax this assumption by optimizing for the worst case within an ambiguity set. Earlier studies on RMDPs have largely centered on risk-neutral reinforcement learning (RL), where the goal is to minimize the expected total discounted cost. In this paper, we analyze the robustness of CVaR-based risk-sensitive RL under RMDPs. First, we consider predetermined ambiguity sets. Based on the coherency of CVaR, we establish a connection between robustness and risk sensitivity, so that techniques from risk-sensitive RL can be adopted to solve the proposed problem. Furthermore, motivated by the existence of decision-dependent uncertainty in real-world problems, we study problems with state-action-dependent ambiguity sets. To solve them, we define a new risk measure named NCVaR and establish the equivalence of NCVaR optimization and robust CVaR optimization. We further propose value iteration algorithms and validate our approach in simulation experiments.
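The connection between robustness and risk sensitivity mentioned in the abstract rests on a standard property of coherent risk measures: CVaR of a cost equals a worst-case expectation over a set of reweighted distributions. The sketch below illustrates that textbook duality on a small discrete cost distribution. It is not the paper's algorithm and does not implement NCVaR; the function names, the example numbers, and the convention that `alpha` is the tail fraction of worst outcomes are assumptions made for illustration.

```python
def cvar_primal(costs, probs, alpha):
    """CVaR via the Rockafellar-Uryasev form: min_t { t + E[(X - t)^+] / alpha }.

    For a discrete distribution the minimum is attained at a support point,
    so it is enough to try each cost value as the threshold t.
    Assumption: alpha in (0, 1] is the tail fraction of worst (highest) costs.
    """
    def objective(t):
        excess = sum(p * max(c - t, 0.0) for c, p in zip(costs, probs))
        return t + excess / alpha
    return min(objective(t) for t in costs)


def cvar_dual(costs, probs, alpha):
    """CVaR as a worst-case expectation over reweighted distributions q.

    The adversary picks q with q_i <= p_i / alpha and sum(q) == 1 to maximize
    the expected cost; greedily loading mass onto the highest costs is optimal.
    """
    remaining = 1.0
    total = 0.0
    for c, p in sorted(zip(costs, probs), reverse=True):
        q = min(p / alpha, remaining)  # cap on how far mass can be shifted
        total += q * c
        remaining -= q
        if remaining <= 0.0:
            break
    return total


if __name__ == "__main__":
    # Hypothetical cost distribution, chosen only for illustration.
    costs = [1.0, 2.0, 10.0]
    probs = [0.5, 0.4, 0.1]
    print(cvar_primal(costs, probs, alpha=0.2))
    print(cvar_dual(costs, probs, alpha=0.2))
```

Both functions agree on this example, which is the sense in which a risk-sensitive objective can be read as a robust one: the constraint set on `q` plays the role of an ambiguity set around the nominal distribution `probs`. With `alpha = 1` the set collapses to the nominal distribution and both functions return the plain expectation.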
Pages: 520-525
Page count: 6