Robust Risk-Sensitive Reinforcement Learning with Conditional Value-at-Risk

Citations: 0
Authors
Ni, Xinyi [1 ]
Lai, Lifeng [1 ]
Affiliations
[1] Univ Calif Davis, Elect & Comp Engn, Davis, CA, USA
Source
2024 IEEE Information Theory Workshop (ITW 2024), 2024
Funding
U.S. National Science Foundation
Keywords
ambiguity sets; RMDP; risk-sensitive RL; CVaR; optimization
DOI
10.1109/ITW61385.2024.10806953
Chinese Library Classification
TP (automation technology; computer technology)
Discipline code
0812
Abstract
Robust Markov Decision Processes (RMDPs) have received significant research interest as an alternative to standard Markov Decision Processes (MDPs), which typically assume fixed transition probabilities; RMDPs instead optimize for the worst case over an ambiguity set of transition models. While earlier work on RMDPs has largely centered on risk-neutral reinforcement learning (RL), with the goal of minimizing the expected total discounted cost, this paper analyzes the robustness of CVaR-based risk-sensitive RL under RMDPs. First, we consider predetermined ambiguity sets. Using the coherence of CVaR, we establish a connection between robustness and risk sensitivity, so that techniques from risk-sensitive RL can be adapted to solve the proposed problem. Furthermore, motivated by decision-dependent uncertainty in real-world problems, we study problems with state-action-dependent ambiguity sets. To solve these, we define a new risk measure, NCVaR, and establish the equivalence of NCVaR optimization and robust CVaR optimization. We further propose value iteration algorithms and validate our approach in simulation experiments.
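The abstract centers on Conditional Value-at-Risk (CVaR), the expected cost in the worst α-fraction of outcomes. As a minimal illustration only (not the paper's NCVaR algorithm), the empirical CVaR of a cost sample can be computed via the standard Rockafellar-Uryasev representation, CVaR_α(X) = min_s { s + E[(X − s)⁺] / α }:

```python
import numpy as np

def cvar(costs, alpha):
    """Empirical CVaR at level alpha for a cost sample (higher cost = worse).

    Uses the Rockafellar-Uryasev form: CVaR_alpha(X) = min_s { s + E[(X-s)^+] / alpha },
    minimized at s = VaR_alpha, here the (1-alpha)-quantile of the costs.
    """
    costs = np.asarray(costs, dtype=float)
    s = np.quantile(costs, 1.0 - alpha)  # Value-at-Risk at level alpha
    return s + np.mean(np.maximum(costs - s, 0.0)) / alpha

# With 4 samples and alpha = 0.25, CVaR is the mean of the worst 25%,
# i.e. the single worst outcome.
print(cvar([1.0, 2.0, 3.0, 10.0], 0.25))  # -> 10.0
```

As α → 1, CVaR approaches the risk-neutral expected cost; small α focuses the objective on tail outcomes, which is the risk-sensitive regime the paper connects to worst-case robustness.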
Pages: 520-525 (6 pages)