Risk-Sensitivity Through Multi-Objective Reinforcement Learning

Authors
Van Moffaert, Kristof [1]
Brys, Tim [1]
Nowe, Ann [1]
Affiliations
[1] Vrije Universiteit Brussel, Department of Computer Science, Brussels, Belgium
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Usually in reinforcement learning, the goal of the agent is to maximize the expected return. However, in practical applications, algorithms that focus solely on maximizing the mean return can be inappropriate because they do not account for the variability of their solutions. A variability measure can therefore be included to accommodate a risk-sensitive setting, i.e., one where the system engineer can explicitly define the tolerated level of variance. Our approach is based on multi-objectivization, where a standard single-objective environment is extended with one or more additional objectives. More precisely, we augment the standard feedback signal of an environment with an additional objective that defines the variance of the solution. We highlight that our algorithm, named risk-sensitive Pareto Q-learning, (1) is specifically tailored to learn a set of Pareto non-dominated policies that trade off these two objectives, and (2) can retrieve every policy that has been learned throughout the state-action space. This is in contrast to standard risk-sensitive approaches, where only a single trade-off between mean and variance is learned at a time.
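The notion of Pareto non-dominated mean-variance trade-offs mentioned in the abstract can be illustrated with a minimal sketch. This is not the risk-sensitive Pareto Q-learning algorithm itself; it only shows, under the assumption that each candidate policy is summarized by a hypothetical (mean return, variance) pair, how dominated candidates are filtered out and how a tolerated variance level could then select one policy. The function names and the sample values are illustrative assumptions, not taken from the paper.

```python
from typing import List, Optional, Tuple

# Assumption: each policy is summarized as (mean return, variance of return).
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """Return True if a Pareto-dominates b.

    Mean return is maximized and variance is minimized, so a dominates b
    when it is no worse in both objectives and strictly better in one.
    """
    mean_a, var_a = a
    mean_b, var_b = b
    no_worse = mean_a >= mean_b and var_a <= var_b
    strictly_better = mean_a > mean_b or var_a < var_b
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def best_within_variance(front: List[Point], max_variance: float) -> Optional[Point]:
    """Pick the highest-mean point whose variance is within the tolerated level."""
    feasible = [p for p in front if p[1] <= max_variance]
    return max(feasible, key=lambda p: p[0]) if feasible else None


# Hypothetical candidate policies, made up for illustration only.
candidates = [(10.0, 4.0), (8.0, 1.0), (9.0, 5.0), (6.0, 1.5), (10.0, 6.0)]
front = pareto_front(candidates)
choice = best_within_variance(front, max_variance=2.0)
```

With these made-up values, two candidates remain on the front: one with a higher mean and higher variance, and one with a lower mean and lower variance. Keeping the whole front, rather than a single weighted compromise, is what allows the tolerated variance to be chosen after learning.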
Pages: 1746-1753
Number of pages: 8