Multi-objective Reinforcement Learning with Path Integral Policy Improvement

Cited by: 0
Authors
Ariizumi, Ryo [1 ]
Sago, Hayato [2 ]
Asai, Toru [2 ]
Azuma, Shun-ichi
Affiliations
[1] Tokyo Univ Agr & Technol, Dept Mech Syst Engn, Tokyo, Japan
[2] Nagoya Univ, Grad Sch Engn, Nagoya, Japan
Keywords
Multi-objective reinforcement learning; policy improvement
DOI
10.23919/SICE59929.2023.10354223
CLC classification
TP [automation technology; computer technology]
Subject classification code
0812
Abstract
Multi-objective reinforcement learning (MORL) for robot motion learning is a challenging problem, not only because of data scarcity but also because of the high-dimensional and continuous state and action spaces. Most existing MORL algorithms are inadequate in this regard. In single-objective reinforcement learning, however, policy-based algorithms have overcome the difficulty of high-dimensional and continuous state and action spaces. Among such algorithms is policy improvement with path integrals (PI2), which has been successful in robot motion learning. PI2 is similar to evolution strategies (ES), and multi-objective optimization is an active topic in ES research. This paper proposes a MORL algorithm based on PI2 and multi-objective ES that can handle such problems in robot motion learning. Its effectiveness is shown via numerical simulations.
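The PI2 update the abstract refers to is, at its core, a derivative-free, softmax-weighted parameter update, which is what makes its kinship with evolution strategies concrete: sample perturbed parameters, roll them out, and average the perturbations weighted by exponentiated cost. The sketch below is a minimal single-objective illustration on a toy quadratic cost, not the paper's multi-objective algorithm; the rollout count, noise scale `sigma`, and temperature `lam` are illustrative choices, not values from the paper.

```python
import numpy as np

def pi2_update(theta, cost_fn, n_rollouts=32, sigma=0.1, lam=0.05, rng=None):
    """One PI2-style update: softmax-weighted average of sampled perturbations."""
    rng = np.random.default_rng() if rng is None else rng
    # Exploration noise around the current parameters
    eps = rng.normal(0.0, sigma, size=(n_rollouts, theta.size))
    # Rollout cost S_k for each perturbed parameter vector
    costs = np.array([cost_fn(theta + e) for e in eps])
    # Normalize costs to [0, 1], then exponentiate: lower cost -> higher weight
    s = (costs - costs.min()) / max(costs.max() - costs.min(), 1e-12)
    w = np.exp(-s / lam)
    w /= w.sum()
    # Move toward the cost-weighted average perturbation
    return theta + w @ eps

# Toy usage: minimize a quadratic cost from theta = (1, 1, 1)
cost = lambda th: float(np.sum(th ** 2))
rng = np.random.default_rng(0)
theta = np.ones(3)
for _ in range(50):
    theta = pi2_update(theta, cost, rng=rng)
```

Because the update only needs rollout costs, never gradients, it applies directly to episodic robot-motion costs; the paper's contribution is extending this scheme to multiple objectives via multi-objective ES.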
Pages: 1418 - 1423 (6 pages)