Multi-objective Reinforcement Learning with Path Integral Policy Improvement

Cited by: 0
Authors
Ariizumi, Ryo [1 ]
Sago, Hayato [2 ]
Asai, Toru [2 ]
Azuma, Shun-ichi
Affiliations
[1] Tokyo Univ Agr & Technol, Dept Mech Syst Engn, Tokyo, Japan
[2] Nagoya Univ, Grad Sch Engn, Nagoya, Japan
Keywords
Multi-objective reinforcement learning; policy improvement
DOI
10.23919/SICE59929.2023.10354223
CLC classification
TP [automation technology; computer technology]
Subject classification code
0812
Abstract
Multi-objective reinforcement learning (MORL) for robot motion learning is a challenging problem, not only because of data scarcity but also because of the high-dimensional and continuous state and action spaces. Most existing MORL algorithms are inadequate in this regard. In single-objective reinforcement learning, however, policy-based algorithms have overcome the difficulty of high-dimensional and continuous state and action spaces. Among such algorithms is policy improvement with path integrals (PI2), which has been successful in robot motion learning. PI2 is similar to evolution strategies (ES), and multi-objective optimization is an active topic in ES research. This paper proposes a MORL algorithm based on PI2 and multi-objective ES that can handle such problems in robot motion learning. Its effectiveness is shown via numerical simulations.
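The PI2 update the abstract refers to is, at its core, a derivative-free, softmax-weighted parameter update, which is what makes its kinship with evolution strategies concrete: sample perturbed parameters, roll them out, and average the perturbations weighted by exponentiated cost. The sketch below is a minimal single-objective illustration on a toy quadratic cost, not the paper's multi-objective algorithm; the rollout count, noise scale `sigma`, and temperature `lam` are illustrative choices, not values from the paper.

```python
import numpy as np

def pi2_update(theta, cost_fn, n_rollouts=32, sigma=0.1, lam=0.05, rng=None):
    """One PI2-style update: softmax-weighted average of sampled perturbations."""
    rng = np.random.default_rng() if rng is None else rng
    # Exploration noise around the current parameters
    eps = rng.normal(0.0, sigma, size=(n_rollouts, theta.size))
    # Rollout cost S_k for each perturbed parameter vector
    costs = np.array([cost_fn(theta + e) for e in eps])
    # Normalize costs to [0, 1], then exponentiate: lower cost -> higher weight
    s = (costs - costs.min()) / max(costs.max() - costs.min(), 1e-12)
    w = np.exp(-s / lam)
    w /= w.sum()
    # Move toward the cost-weighted average perturbation
    return theta + w @ eps

# Toy usage: minimize a quadratic cost from theta = (1, 1, 1)
cost = lambda th: float(np.sum(th ** 2))
rng = np.random.default_rng(0)
theta = np.ones(3)
for _ in range(50):
    theta = pi2_update(theta, cost, rng=rng)
```

Because the update only needs rollout costs, never gradients, it applies directly to episodic robot-motion costs; the paper's contribution is extending this scheme to multiple objectives via multi-objective ES.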
Pages: 1418 - 1423 (6 pages)