Robust Risk-Sensitive Reinforcement Learning with Conditional Value-at-Risk

Authors
Ni, Xinyi [1]
Lai, Lifeng [1]
Affiliations
[1] Univ Calif Davis, Elect & Comp Engn, Davis, CA USA
Source
2024 IEEE INFORMATION THEORY WORKSHOP, ITW 2024 | 2024
Funding
National Science Foundation (USA)
Keywords
ambiguity sets; RMDP; risk-sensitive RL; CVaR; OPTIMIZATION;
DOI
10.1109/ITW61385.2024.10806953
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Robust Markov Decision Processes (RMDPs) have received significant research interest as an alternative to standard Markov Decision Processes (MDPs), which often assume fixed transition probabilities. RMDPs relax this assumption by optimizing for the worst-case scenario within an ambiguity set. Earlier studies on RMDPs have largely centered on risk-neutral reinforcement learning (RL), where the goal is to minimize the expected total discounted cost. In this paper, we analyze the robustness of CVaR-based risk-sensitive RL under RMDPs. First, we consider predetermined ambiguity sets. Based on the coherency of CVaR, we establish a connection between robustness and risk sensitivity, so that techniques from risk-sensitive RL can be adopted to solve the proposed problem. Furthermore, motivated by the existence of decision-dependent uncertainty in real-world problems, we study problems with state-action-dependent ambiguity sets. To solve these, we define a new risk measure named NCVaR and establish the equivalence of NCVaR optimization and robust CVaR optimization. We further propose value iteration algorithms and validate our approach in simulation experiments.
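The link between robustness and risk sensitivity mentioned in the abstract can be illustrated with the standard dual representation of CVaR. The sketch below is not the paper's algorithm. It assumes a finite cost distribution, treats alpha as the tail probability (CVaR is the mean of the worst alpha fraction of costs), and uses the hypothetical function names cvar_primal and cvar_dual. It computes CVaR once with the Rockafellar-Uryasev minimization and once as a worst-case expectation over reweighted distributions whose density ratio is at most 1/alpha; the two values should agree.

```python
import numpy as np


def cvar_primal(costs, probs, alpha):
    """Rockafellar-Uryasev form: min over w of w + E[(Z - w)^+] / alpha."""
    costs = np.asarray(costs, dtype=float)
    probs = np.asarray(probs, dtype=float)
    # The objective is piecewise linear and convex in w, so for a finite
    # distribution its minimum is attained at one of the cost atoms.
    return min(w + probs @ np.maximum(costs - w, 0.0) / alpha for w in costs)


def cvar_dual(costs, probs, alpha):
    """Worst-case expectation over q with 0 <= q_i <= p_i / alpha, sum(q) = 1."""
    costs = np.asarray(costs, dtype=float)
    probs = np.asarray(probs, dtype=float)
    remaining = 1.0  # probability mass still to be assigned
    value = 0.0
    # An adversary maximizing expected cost fills the highest costs first,
    # up to the cap p_i / alpha on each outcome.
    for i in np.argsort(-costs):
        q = min(probs[i] / alpha, remaining)
        value += q * costs[i]
        remaining -= q
        if remaining <= 1e-12:
            break
    return value


# Illustrative numbers only (assumed, not from the paper).
costs = [0.0, 1.0, 10.0]
probs = [0.5, 0.4, 0.1]
print(cvar_primal(costs, probs, alpha=0.2))  # risk-sensitive view
print(cvar_dual(costs, probs, alpha=0.2))    # robust (worst-case) view
```

Under these assumptions, the set of reweighted distributions in cvar_dual plays the role of an ambiguity set around the nominal probabilities, which is the sense in which a CVaR objective can be read as a robust one. Setting alpha to 1 collapses that set to the nominal distribution and recovers the ordinary expectation.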
Pages: 520 - 525
Page count: 6