Distributional Reinforcement Learning for Risk-Sensitive Policies

Times cited: 0
Authors
Lim, Shiau Hong [1]
Malik, Ilyas [1]
Affiliations
[1] IBM Res, Singapore, Singapore
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We address the problem of learning a risk-sensitive policy based on the CVaR risk measure using distributional reinforcement learning. In particular, we show that the standard action-selection strategy when applying the distributional Bellman optimality operator can result in convergence to neither the dynamic, Markovian CVaR nor the static, non-Markovian CVaR. We propose modifications to the existing algorithms that include a new distributional Bellman operator and show that the proposed strategy greatly expands the utility of distributional RL in learning and representing CVaR-optimized policies. Our proposed approach is a simple extension of standard distributional RL algorithms and can therefore take advantage of many of the recent advances in deep RL. On both synthetic and real data, we empirically show that our proposed algorithm is able to learn better CVaR-optimized policies.
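As a rough illustration of the quantities the abstract refers to, the sketch below computes a lower-tail CVaR from a set of equally weighted return samples (as a quantile-based distributional RL agent might hold for each action) and then picks the action with the highest CVaR in a single state. This is only an assumed reading of a generic per-state greedy CVaR selection rule; it is not the new distributional Bellman operator proposed in the paper, which the abstract does not specify. The function names `cvar_from_quantiles` and `select_action_by_cvar` and the toy numbers are hypothetical.

```python
import math


def cvar_from_quantiles(quantiles, alpha):
    """Lower-tail CVaR at level alpha for equally weighted return samples.

    Averages the worst ceil(alpha * N) outcomes; this is exact when
    alpha * N is an integer and an approximation otherwise.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    q = sorted(float(x) for x in quantiles)
    k = max(1, math.ceil(alpha * len(q)))  # number of worst outcomes kept
    return sum(q[:k]) / k


def select_action_by_cvar(quantiles_per_action, alpha):
    """Greedy choice: index of the action with the largest CVaR, plus all scores."""
    scores = [cvar_from_quantiles(q, alpha) for q in quantiles_per_action]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy example (made-up numbers): action 0 has the higher mean but a bad tail.
returns = [[-10.0, 5.0, 5.0, 5.0], [0.0, 1.0, 1.0, 1.0]]
risk_averse_action, risk_averse_scores = select_action_by_cvar(returns, alpha=0.25)
risk_neutral_action, risk_neutral_scores = select_action_by_cvar(returns, alpha=1.0)
```

With `alpha=1.0` the CVaR reduces to the mean, so the two calls contrast a risk-neutral choice with a risk-averse one on the same return distributions.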
Pages: 13
Related papers
50 in total
  • [1] Risk-Sensitive Policy with Distributional Reinforcement Learning
    Theate, Thibaut
    Ernst, Damien
    ALGORITHMS, 2023, 16 (07)
  • [2] Risk-sensitive Distributional Reinforcement Learning for Flight Control
    Seres, Peter
    Liu, Cheng
    van Kampen, Erik-Jan
    IFAC PAPERSONLINE, 2023, 56 (02) : 2013 - 2018
  • [3] Distributional Model Equivalence for Risk-Sensitive Reinforcement Learning
    Kastner, Tyler
    Erdogdu, Murat A.
    Farahmand, Amir-massoud
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [4] Risk-Sensitive Portfolio Management by using Distributional Reinforcement Learning
    Harnpadungkij, Thammasorn
    Chaisangmongkon, Warasinee
    Phunchongharn, Phond
    2019 IEEE 10TH INTERNATIONAL CONFERENCE ON AWARENESS SCIENCE AND TECHNOLOGY (ICAST 2019), 2019 : 110 - 115
  • [5] Bridging Distributional and Risk-sensitive Reinforcement Learning with Provable Regret Bounds
    Liang, Hao
    Luo, Zhi-Quan
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25
  • [6] RMIX: Learning Risk-Sensitive Policies for Cooperative Reinforcement Learning Agents
    Qiu, Wei
    Wang, Xinrun
    Yu, Runsheng
    He, Xu
    Wang, Rundong
    An, Bo
    Obraztsova, Svetlana
    Rabinovich, Zinovi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [7] Risk-Sensitive Reinforcement Learning
    Shen, Yun
    Tobia, Michael J.
    Sommer, Tobias
    Obermayer, Klaus
    NEURAL COMPUTATION, 2014, 26 (07) : 1298 - 1328
  • [8] Risk-sensitive reinforcement learning
    Mihatsch, O
    Neuneier, R
    MACHINE LEARNING, 2002, 49 (2-3) : 267 - 290
  • [9] Risk-Sensitive Reinforcement Learning
    Oliver Mihatsch
    Ralph Neuneier
    Machine Learning, 2002, 49 : 267 - 290
  • [10] Inverse Risk-Sensitive Reinforcement Learning
    Ratliff, Lillian J.
    Mazumdar, Eric
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2020, 65 (03) : 1256 - 1263