Risk-Sensitive Policy with Distributional Reinforcement Learning

Cited by: 2
Authors
Theate, Thibaut [1 ]
Ernst, Damien [1 ,2 ]
Affiliations
[1] Univ Liege, Dept Elect Engn & Comp Sci, B-4031 Liege, Belgium
[2] Inst Polytech Paris, Informat Proc & Commun Lab, F-91120 Paris, France
Keywords
distributional reinforcement learning; sequential decision-making; risk-sensitive policy; risk management; deep neural network;
DOI
10.3390/a16070325
CLC classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Classical reinforcement learning (RL) techniques are generally concerned with the design of decision-making policies driven by the maximisation of the expected outcome. However, this approach does not take into consideration the potential risk associated with the actions taken, which may be critical in certain applications. To address this issue, the present research work introduces a novel methodology based on distributional RL for deriving sequential decision-making policies that are sensitive to risk, where risk is modelled by the tail of the return probability distribution. The core idea is to replace the Q function, which typically lies at the heart of RL learning schemes, with another function taking into account both the expected return and the risk. Named the risk-based utility function U, it can be extracted from the random return distribution Z naturally learnt by any distributional RL algorithm. This makes it possible to span the complete trade-off between risk minimisation and expected return maximisation, in contrast to fully risk-averse methodologies. Fundamentally, this research yields a practical and accessible solution for learning risk-sensitive policies with minimal modification to the distributional RL algorithm, with an emphasis on the interpretability of the resulting decision-making process.
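As a concrete illustration of the methodology summarised in the abstract, the short Python sketch below shows how a risk-based utility function U might be computed from quantile estimates of the random return distribution Z, as produced by a quantile-based distributional RL agent such as QR-DQN. The CVaR tail level alpha, the trade-off weight rho, and the function name risk_based_utility are illustrative assumptions for this sketch, not the paper's exact formulation.

import numpy as np

def risk_based_utility(z_quantiles: np.ndarray, alpha: float = 0.25,
                       rho: float = 0.5) -> np.ndarray:
    """Blend expected return with left-tail risk for each action.

    z_quantiles: array of shape (num_actions, num_quantiles) holding
                 quantile estimates of the return distribution Z(s, a).
    alpha:       fraction of the left tail used as the risk measure
                 (a CVaR-style tail expectation); an assumption here.
    rho:         trade-off weight; rho = 0 recovers the risk-neutral
                 Q function, rho = 1 gives a fully risk-averse criterion.
    """
    expected_return = z_quantiles.mean(axis=1)        # estimate of Q(s, a)
    k = max(1, int(alpha * z_quantiles.shape[1]))     # number of tail samples
    tail = np.sort(z_quantiles, axis=1)[:, :k]        # worst-case quantiles
    tail_risk = tail.mean(axis=1)                     # CVaR-style tail mean
    return (1.0 - rho) * expected_return + rho * tail_risk

# Greedy risk-sensitive action selection: argmax over U instead of Q.
quantiles = np.random.randn(4, 51)  # toy Z estimates for 4 actions
action = int(np.argmax(risk_based_utility(quantiles)))

Sweeping rho across [0, 1] traces out the trade-off between expected return maximisation and risk minimisation mentioned in the abstract.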
Pages: 16
Related papers
50 records in total
  • [21] Risk-sensitive reinforcement learning algorithms with generalized average criterion
    Yin, Chang-ming
    Wang, Han-xing
    Zhao, Fei
    APPLIED MATHEMATICS AND MECHANICS-ENGLISH EDITION, 2007, 28 (03) : 405 - 416
  • [22] Gradient-Based Inverse Risk-Sensitive Reinforcement Learning
    Mazumdar, Eric
    Ratliff, Lillian J.
    Fiez, Tanner
    Sastry, S. Shankar
    2017 IEEE 56TH ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2017,
  • [23] Risk-sensitive reinforcement learning applied to control under constraints
    Geibel, P
    Wysotzki, F
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2005, 24 : 81 - 108
  • [25] Risk-Sensitive Reinforcement Learning for URLLC Traffic in Wireless Networks
    Ben Khalifa, Nesrine
    Assaad, Mohamad
    Debbah, Merouane
    2019 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE (WCNC), 2019,
  • [27] Risk-Sensitive Reinforcement Learning with Function Approximation: A Debiasing Approach
    Fei, Yingjie
    Yang, Zhuoran
    Wang, Zhaoran
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [28] Off-Policy Risk-Sensitive Reinforcement Learning-Based Constrained Robust Optimal Control
    Li, Cong
    Liu, Qingchen
    Zhou, Zhehua
    Buss, Martin
    Liu, Fangzhou
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2023, 53 (04) : 2478 - 2491
  • [29] Regret Bounds for Risk-sensitive Reinforcement Learning with Lipschitz Dynamic Risk Measures
    Liang, Hao
    Luo, Zhi-Quan
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [30] Risk-Sensitive Reinforcement Learning Part I: Constrained Optimization Framework
    Prashanth, L. A.
    2019 FIFTH INDIAN CONTROL CONFERENCE (ICC), 2019 : 9 - 9