Risk-Sensitivity Through Multi-Objective Reinforcement Learning

Cited by: 0

Authors:

Van Moffaert, Kristof [1]

Brys, Tim [1]

Nowe, Ann [1]

Affiliation:

[1] Vrije Univ Brussel, Dept Comp Sci, Brussels, Belgium

Source:

2015 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC) | 2015

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Usually in reinforcement learning, the goal of the agent is to maximize the expected return. However, in practical applications, algorithms that focus solely on maximizing the mean return can be inappropriate because they do not account for the variability of their solutions. A variability measure can therefore be included to accommodate a risk-sensitive setting, i.e. one where the system engineer can explicitly define the tolerated level of variance. Our approach is based on multi-objectivization, where a standard single-objective environment is extended with one (or more) additional objectives. More precisely, we augment the standard feedback signal of an environment with an additional objective that defines the variance of the solution. We highlight that our algorithm, named risk-sensitive Pareto Q-learning, (1) is specifically tailored to learn a set of Pareto non-dominated policies that trade off these two objectives and (2) can retrieve every policy that has been learned throughout the state-action space. This is in contrast to standard risk-sensitive approaches, where only a single trade-off between mean and variance is learned at a time.
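
The notion of a Pareto non-dominated trade-off between mean and variance can be illustrated with a minimal sketch. This is not the paper's risk-sensitive Pareto Q-learning algorithm; it only filters a fixed set of candidate policies, each summarized by the mean and variance of its sampled returns. The policy names, the return samples, and the helper functions `dominates`, `pareto_front`, and `summarize` are hypothetical and chosen for illustration.

```python
from statistics import mean, pvariance


def dominates(a, b):
    """Return True if (mean, variance) pair a Pareto-dominates pair b.

    Higher mean is better, lower variance is better; a must be at least
    as good on both objectives and strictly better on at least one.
    """
    return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def summarize(returns):
    """Summarize sampled returns as a (mean, population variance) pair."""
    return (mean(returns), pvariance(returns))


# Hypothetical sampled returns for four candidate policies (illustrative only).
policy_returns = {
    "cautious": [4.0, 4.0, 4.0, 4.0],
    "balanced": [2.0, 6.0, 4.0, 8.0],
    "risky": [0.0, 12.0, 0.0, 12.0],
    "wasteful": [0.0, 6.0, 0.0, 6.0],
}

summaries = {name: summarize(r) for name, r in policy_returns.items()}
front = pareto_front(list(summaries.values()))
non_dominated = sorted(name for name, s in summaries.items() if s in front)
print(non_dominated)
```

In this toy data, "wasteful" has both a lower mean and a higher variance than "cautious", so it is dominated; the other three represent different mean-variance trade-offs and are all kept. A single-trade-off method would return only one of them per run.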


Pages: 1746-1753

Page count: 8

50 items in total

[21] A Multi-objective Reinforcement Learning Algorithm for JSSP
Mendez-Hernandez, Beatriz M.
Rodriguez-Bazan, Erick D.
Martinez-Jimenez, Yailen
Libin, Pieter
Nowe, Ann
[J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2019: THEORETICAL NEURAL COMPUTATION, PT I, 2019, 11727 : 567 - 584
[22] Dynamic Weights in Multi-Objective Deep Reinforcement Learning
Abels, Axel
Roijers, Diederik M.
Lenaerts, Tom
Nowe, Ann
Steckelmacher, Denis
[J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
[23] Multi-Objective Order Scheduling via Reinforcement Learning
Chen, Sirui
Tian, Yuming
An, Lingling
[J]. ALGORITHMS, 2023, 16 (11)
[24] Urban Driving with Multi-Objective Deep Reinforcement Learning
Li, Changjian
Czarnecki, Krzysztof
[J]. AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 359 - 367
[25] A temporal difference method for multi-objective reinforcement learning
Ruiz-Montiel, Manuela
Mandow, Lawrence
Perez-de-la-Cruz, Jose-Luis
[J]. NEUROCOMPUTING, 2017, 263 : 15 - 25
[26] Taming Lagrangian chaos with multi-objective reinforcement learning
Chiara Calascibetta
Luca Biferale
Francesco Borra
Antonio Celani
Massimo Cencini
[J]. The European Physical Journal E, 2023, 46
[27] Model-Based Multi-Objective Reinforcement Learning
Wiering, Marco A.
Withagen, Maikel
Drugan, Madalina M.
[J]. 2014 IEEE SYMPOSIUM ON ADAPTIVE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING (ADPRL), 2014, : 111 - 116
[28] Multi-objective reinforcement learning approach for trip recommendation
Chen, Lei
Zhu, Guixiang
Liang, Weichao
Wang, Youquan
[J]. EXPERT SYSTEMS WITH APPLICATIONS, 2023, 226
[29] A reinforcement learning approach for dynamic multi-objective optimization
Zou, Fei
Yen, Gary G.
Tang, Lixin
Wang, Chunfeng
[J]. INFORMATION SCIENCES, 2021, 546 : 815 - 834
[30] Multi-objective Genetic Programming for Explainable Reinforcement Learning
Videau, Mathurin
Leite, Alessandro
Teytaud, Olivier
Schoenauer, Marc
[J]. GENETIC PROGRAMMING (EUROGP 2022), 2022, : 278 - 293
