Risk-Sensitive Policy with Distributional Reinforcement Learning

Cited by: 2
Authors
Theate, Thibaut [1]
Ernst, Damien [1, 2]
Affiliations
[1] Univ Liege, Dept Elect Engn & Comp Sci, B-4031 Liege, Belgium
[2] Inst Polytech Paris, Informat Proc & Commun Lab, F-91120 Paris, France
Keywords
distributional reinforcement learning; sequential decision-making; risk-sensitive policy; risk management; deep neural network
DOI
10.3390/a16070325
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Classical reinforcement learning (RL) techniques are generally concerned with the design of decision-making policies driven by the maximisation of the expected outcome. Nevertheless, this approach does not take into consideration the potential risk associated with the actions taken, which may be critical in certain applications. To address that issue, the present research work introduces a novel methodology based on distributional RL to derive sequential decision-making policies that are sensitive to the risk, the latter being modelled by the tail of the return probability distribution. The core idea is to replace the Q function generally standing at the core of learning schemes in RL by another function, taking into account both the expected return and the risk. Named the risk-based utility function U, it can be extracted from the random return distribution Z naturally learnt by any distributional RL algorithm. This enables the spanning of the complete potential trade-off between risk minimisation and expected return maximisation, in contrast to fully risk-averse methodologies. Fundamentally, this research yields a truly practical and accessible solution for learning risk-sensitive policies with minimal modification to the distributional RL algorithm, with an emphasis on the interpretability of the resulting decision-making process.
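To make the core idea concrete, the following is a minimal sketch of how a risk-based utility could be computed from sampled returns and used in place of Q for greedy action selection. It is an illustration under stated assumptions, not the authors' implementation: the abstract only says that U accounts for both the expected return and the risk modelled by the tail of the return distribution. The convex combination weighted by `alpha`, the tail level `rho`, the use of the mean of the worst `rho` fraction of samples as the risk term, and the function names are all hypothetical choices made here.

```python
from math import ceil
from statistics import fmean


def risk_based_utility(returns, alpha=0.5, rho=0.1):
    """Blend expected return with a lower-tail risk measure.

    returns: samples of the random return Z for one action.
    alpha:   assumed trade-off weight (1.0 = pure expected return).
    rho:     assumed tail fraction used for the risk term.
    """
    ordered = sorted(returns)
    # Number of worst-case samples that make up the lower tail (at least one).
    k = max(1, ceil(rho * len(ordered)))
    expected = fmean(ordered)   # plays the role of Q
    tail = fmean(ordered[:k])   # mean of the worst outcomes (assumed risk term)
    return alpha * expected + (1.0 - alpha) * tail


def greedy_action(returns_per_action, alpha=0.5, rho=0.1):
    """Pick the action with the highest risk-based utility instead of the highest Q."""
    utilities = [risk_based_utility(r, alpha, rho) for r in returns_per_action]
    return max(range(len(utilities)), key=utilities.__getitem__)


# Action 0 is safe; action 1 has a higher mean but a rare large loss.
samples = [
    [1.0] * 10,
    [2.0] * 9 + [-5.0],
]
risk_neutral_choice = greedy_action(samples, alpha=1.0)
risk_sensitive_choice = greedy_action(samples, alpha=0.5)
```

With `alpha=1.0` the rule reduces to maximising the expected return, while smaller values of `alpha` give more weight to the tail, which is one way to span the trade-off between risk minimisation and expected return maximisation described above.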
Pages: 16