Neuroevolutionary diversity policy search for multi-objective reinforcement learning

Cited by: 2

Authors:

Zhou, Dan [1]

Du, Jiqing [1]

Arai, Sachiyo [1]

Affiliations:

[1] Chiba Univ, Grad Sch Sci & Engn, Dept Urban Environm Syst, Div Earth & Environm Sci, Chiba, Japan

Source:

INFORMATION SCIENCES | 2024 / Volume 657

Funding:

Japan Science and Technology Agency;

Keywords:

Multi-objective reinforcement learning; Pareto front; Policy search; Multi-objective evolutionary algorithm; Diversity; ALGORITHM; DESIGN;

DOI:

10.1016/j.ins.2023.119932

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Sequential decision-making requires balancing multiple conflicting objectives through multi-objective reinforcement learning (MORL). Moreover, decision-makers desire dense solutions that satisfy their requirements and consider the trade-offs between different objectives (Pareto optimal solutions). Most deep reinforcement learning methods focus on single-objective problems or solve multi-objective problems using simple linear combinations, which may oversimplify the underlying problem and lead to suboptimal results. This study proposes a neuroevolutionary diversity policy search approach to address MORL problems. It employs neural networks, each equipped with a buffer for storing recent experiences, representing individuals in a population. The non-dominated sorting method and diversity distance metric are employed in the evolutionary process to select high-quality solutions as teachers. The teachers use gradient-based genetic operators to guide the population to produce high-quality offspring, thereby achieving dense Pareto optimal solutions. Furthermore, we introduce three MORL benchmarks with distinct characteristics: (1) a continuous deep sea treasure with convex and nonconvex Pareto fronts; (2) a multi-objective mountain car with sparse rewards and a discontinuous Pareto front; and (3) a multi-objective HalfCheetah with high-dimensional action-state spaces. The experimental results on the three MORL benchmarks demonstrate the superiority of the proposed algorithm in obtaining dense and high-quality Pareto optimal solutions.
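
The teacher-selection step mentioned in the abstract (non-dominated sorting combined with a diversity distance) can be illustrated with a minimal sketch. The abstract does not define the paper's diversity metric, so the code below substitutes the NSGA-II crowding distance as an assumed stand-in. The function names (`dominates`, `non_dominated`, `crowding_distance`, `select_teachers`) and the example return vectors are hypothetical, and all objectives are assumed to be maximized.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated(points: List[Sequence[float]]) -> List[int]:
    """Indices of the points that no other point dominates (the first front)."""
    return [
        i for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


def crowding_distance(points: List[Sequence[float]]) -> List[float]:
    """NSGA-II crowding distance; used here as an assumed diversity measure."""
    n = len(points)
    if n == 0:
        return []
    dist = [0.0] * n
    for k in range(len(points[0])):
        order = sorted(range(n), key=lambda i: points[i][k])
        lo, hi = points[order[0]][k], points[order[-1]][k]
        # Boundary points are always kept, so they get infinite distance.
        dist[order[0]] = dist[order[-1]] = float("inf")
        if hi == lo:
            continue
        for r in range(1, n - 1):
            gap = points[order[r + 1]][k] - points[order[r - 1]][k]
            dist[order[r]] += gap / (hi - lo)
    return dist


def select_teachers(returns: List[Sequence[float]], k: int) -> List[int]:
    """Pick up to k non-dominated individuals, preferring less crowded ones."""
    front = non_dominated(returns)
    dist = crowding_distance([returns[i] for i in front])
    ranked = sorted(range(len(front)), key=lambda r: dist[r], reverse=True)
    return [front[r] for r in ranked[:k]]


# Hypothetical two-objective returns for a population of six policies.
returns = [(1.0, 5.0), (2.0, 4.0), (3.0, 3.0), (2.0, 2.0), (5.0, 1.0), (0.5, 0.5)]
teachers = select_teachers(returns, k=3)
```

In this sketch only the first non-dominated front is considered; a full non-dominated sort would also rank the remaining fronts, and the gradient-based genetic operators described in the abstract are not modeled here.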

Pages: 17

50 entries in total

[1] Multi-objective safe reinforcement learning: the relationship between multi-objective reinforcement learning and safe reinforcement learning
Horie, Naoto
Matsui, Tohgoroh
Moriyama, Koichi
Mutoh, Atsuko
Inuzuka, Nobuhiro
ARTIFICIAL LIFE AND ROBOTICS, 2019, 24 (03) : 352 - 359
[2] Multi-objective safe reinforcement learning: the relationship between multi-objective reinforcement learning and safe reinforcement learning
Naoto Horie
Tohgoroh Matsui
Koichi Moriyama
Atsuko Mutoh
Nobuhiro Inuzuka
Artificial Life and Robotics, 2019, 24 : 352 - 359
[3] Nondominated Policy-Guided Learning in Multi-Objective Reinforcement Learning
Kim, Man-Je
Park, Hyunsoo
Ahn, Chang Wook
ELECTRONICS, 2022, 11 (07)
[4] Multi-objective Reinforcement Learning with Path Integral Policy Improvement
Ariizumi, Ryo
Sago, Hayato
Asai, Toru
Azuma, Shun-ichi
2023 62ND ANNUAL CONFERENCE OF THE SOCIETY OF INSTRUMENT AND CONTROL ENGINEERS, SICE, 2023, : 1418 - 1423
[5] A Generalized Algorithm for Multi-Objective Reinforcement Learning and Policy Adaptation
Yang, Runzhe
Sun, Xingyuan
Narasimhan, Karthik
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
[6] Policy invariance under reward transformations for multi-objective reinforcement learning
Mannion, Patrick
Devlin, Sam
Mason, Karl
Duggan, Jim
Howley, Enda
NEUROCOMPUTING, 2017, 263 : 60 - 73
[7] Safety Optimized Reinforcement Learning via Multi-Objective Policy Optimization
Honari, Homayoun
Tamizi, Mehran Ghafarian
Najjaran, Homayoun
2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2024, 2024, : 2873 - 2879
[8] Local-utopia Policy Selection for Multi-objective Reinforcement Learning
Parisi, Simone
Blank, Alexander
Viernicke, Tobias
Peters, Jan
PROCEEDINGS OF 2016 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2016,
[9] Multi-objective ω-Regular Reinforcement Learning
Hahn, Ernst Moritz
Perez, Mateo
Schewe, Sven
Somenzi, Fabio
Trivedi, Ashutosh
Wojtczak, Dominik
FORMAL ASPECTS OF COMPUTING, 2023, 35 (02)
[10] Federated multi-objective reinforcement learning
Zhao, Fangyuan
Ren, Xuebin
Yang, Shusen
Zhao, Peng
Zhang, Rui
Xu, Xinxin
INFORMATION SCIENCES, 2023, 624 : 811 - 832