Risk-Sensitivity Through Multi-Objective Reinforcement Learning

Authors
Van Moffaert, Kristof [1]
Brys, Tim [1]
Nowe, Ann [1]
Affiliations
[1] Vrije Univ Brussel, Dept Comp Sci, Brussels, Belgium
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In reinforcement learning, the goal of the agent is usually to maximize the expected return. In practical applications, however, algorithms that focus solely on maximizing the mean return can be inappropriate because they do not account for the variability of their solutions. A variability measure can therefore be included to accommodate a risk-sensitive setting, i.e., one where the system engineer can explicitly define the tolerated level of variance. Our approach is based on multi-objectivization, where a standard single-objective environment is extended with one or more additional objectives. More precisely, we augment the standard feedback signal of an environment with an additional objective that defines the variance of the solution. We highlight that our algorithm, named risk-sensitive Pareto Q-learning, (1) is specifically tailored to learn a set of Pareto non-dominated policies that trade off these two objectives and (2) can retrieve every policy that has been learned throughout the state-action space. This is in contrast to standard risk-sensitive approaches, where only a single trade-off between mean and variance is learned at a time.
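The notion of Pareto non-dominance over the two objectives (mean return and variance) can be illustrated with a small sketch. This is not the authors' risk-sensitive Pareto Q-learning algorithm; it only shows, under assumed sample returns, how a set of policies might be filtered down to those that are non-dominated when the mean is maximized and the variance is minimized, and how a tolerated variance level could then pick one of them. All function names, policy names, and sample values are hypothetical.

```python
from statistics import mean, pvariance

def summarize(returns):
    """Return the (mean, variance) pair of a list of sampled returns."""
    return mean(returns), pvariance(returns)

def dominates(a, b):
    """True if a dominates b; both are (mean, variance) pairs.

    Higher mean is better, lower variance is better.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better

def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def best_within_tolerance(points, max_variance):
    """Highest-mean non-dominated point whose variance is tolerated."""
    feasible = [p for p in pareto_front(points) if p[1] <= max_variance]
    return max(feasible, key=lambda p: p[0]) if feasible else None

# Hypothetical sampled returns for four illustrative policies.
sampled_returns = {
    "cautious": [4.0, 4.0, 4.0, 4.0],
    "balanced": [3.0, 7.0, 3.0, 7.0],
    "risky": [0.0, 12.0, 0.0, 12.0],
    "poor": [0.0, 6.0, 0.0, 6.0],
}

summaries = {name: summarize(r) for name, r in sampled_returns.items()}
front = pareto_front(list(summaries.values()))
choice = best_within_tolerance(list(summaries.values()), max_variance=5.0)
```

Here the "poor" policy is filtered out because another policy offers both a higher mean and a lower variance, while the remaining three represent different mean-variance trade-offs. Learning such a set in one run, rather than one trade-off at a time, is the contrast the abstract draws with standard risk-sensitive approaches.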
Pages: 1746-1753
Page count: 8