Efficient Multi-Goal Reinforcement Learning via Value Consistency Prioritization

Authors
Xu, Jiawei [1]
Li, Shuxing [1]
Yang, Rui [2]
Yuan, Chun [1]
Han, Lei [3]
Affiliations
[1] Tsinghua Shenzhen Int Grad Sch, Shenzhen, Guangdong, Peoples R China
[2] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
[3] Tencent Robot X, Shenzhen, Guangdong, Peoples R China
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Goal-conditioned reinforcement learning (RL) with sparse rewards remains a challenging problem in deep RL. Hindsight Experience Replay (HER) has been demonstrated to be an effective solution: it replaces the desired goals in failed experiences with states that were actually achieved. Existing approaches mainly focus on either exploration or exploitation to improve the performance of HER. From a joint perspective, exploiting specific past experiences can also implicitly drive exploration. Therefore, we concentrate on prioritizing both original and relabeled samples for efficient goal-conditioned RL. To achieve this, we propose a novel value consistency prioritization (VCP) method, where the priority of samples is determined by the consistency of ensemble Q-values. This distinguishes VCP from most existing prioritization approaches, which prioritize samples based on the uncertainty of ensemble Q-values. Through extensive experiments, we demonstrate that VCP achieves significantly higher sample efficiency than existing algorithms on a range of challenging goal-conditioned manipulation tasks. We also visualize how VCP prioritizes good experiences to enhance policy learning.
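The two ideas named in the abstract, hindsight relabeling and prioritizing samples by the consistency of ensemble Q-values, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not say how consistency is measured, so the sketch assumes it is the negative standard deviation of the ensemble's Q-values, turned into sampling probabilities with a softmax. The function names, the "final state" relabeling strategy, the 0/-1 reward convention, and the `tol` and `temperature` parameters are all hypothetical choices made for illustration.

```python
import math
import statistics

# A transition is (state, action, achieved_goal, desired_goal); goals are tuples of floats.


def sparse_reward(achieved, goal, tol=0.05):
    """Sparse goal-reaching reward (assumed convention): 0 on success, -1 otherwise."""
    return 0.0 if math.dist(achieved, goal) <= tol else -1.0


def her_relabel(episode):
    """Hindsight relabeling with the 'final' strategy (an assumption):
    replace every desired goal with the goal achieved at the end of the episode."""
    final_achieved = episode[-1][2]
    return [(s, a, ag, final_achieved) for (s, a, ag, _) in episode]


def consistency_priorities(ensemble_q, temperature=1.0):
    """Map per-sample ensemble Q-values to sampling probabilities.

    Assumption: consistency is the negative population standard deviation
    across ensemble members, so samples on which the ensemble agrees get a
    higher probability. The paper's actual measure may differ.
    """
    scores = [-statistics.pstdev(qs) / temperature for qs in ensemble_q]
    top = max(scores)  # subtract the maximum for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Usage: a failed two-step episode whose desired goal (1.0,) was never reached.
episode = [((0.0,), 0, (0.1,), (1.0,)), ((0.1,), 1, (0.3,), (1.0,))]
relabeled = her_relabel(episode)

# Hypothetical Q-values from a three-member ensemble for two samples.
probs = consistency_priorities([[1.0, 1.0, 1.0], [0.0, 2.0, 4.0]])
```

In this sketch both the original and the relabeled transitions would be scored with the same function, which mirrors the abstract's statement that both kinds of samples are prioritized. An uncertainty-based scheme would instead favor the second sample, where the ensemble disagrees.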
Pages: 355-376
Page count: 22
Related papers
50 in total
  • [1] Efficient Multi-Goal Reinforcement Learning via Value Consistency Prioritization
    Xu J.
    Li S.
    Yang R.
    Yuan C.
    Han L.
    Journal of Artificial Intelligence Research, 2023, 77 : 355 - 376
  • [2] Multi-goal Reinforcement Learning via Exploring Successor Matching
    Feng, Xiaoyun
    2022 IEEE CONFERENCE ON GAMES, COG, 2022: 401 - 408
  • [3] Goal Density-based Hindsight Experience Prioritization for Multi-Goal Robot Manipulation Reinforcement Learning
    Kuang, Yingyi
    Weinberg, Abraham Itzhak
    Vogiatzis, George
    Faria, Diego R.
    2020 29TH IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION (RO-MAN), 2020: 432 - 437
  • [4] Guided goal generation for hindsight multi-goal reinforcement learning
    Bai, Chenjia
    Liu, Peng
    Zhao, Wei
    Tang, Xianglong
    NEUROCOMPUTING, 2019, 359 : 353 - 367
  • [5] Maximum Entropy-Regularized Multi-Goal Reinforcement Learning
    Zhao, Rui
    Sun, Xudong
    Tresp, Volker
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [6] Stein Variational Goal Generation for adaptive Exploration in Multi-Goal Reinforcement Learning
    Castanet, Nicolas
    Sigaud, Olivier
    Lamprier, Sylvain
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 202, 2023, 202
  • [7] Combining Hindsight with Goal-enhanced Prediction for Multi-goal Reinforcement Learning
    Yang, Rui
    Luo, Feng
    Li, Xiu
    2021 IEEE 33RD INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2021), 2021: 314 - 321
  • [8] CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning
    Colas, Cedric
    Fournier, Pierre
    Sigaud, Olivier
    Chetouani, Mohamed
    Oudeyer, Pierre-Yves
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [9] Hierarchical reinforcement learning for handling sparse rewards in multi-goal navigation
    Yan, Jiangyue
    Luo, Biao
    Xu, Xiaodong
    ARTIFICIAL INTELLIGENCE REVIEW, 2024, 57 (06)
  • [10] Scalable Multi-Robot Cooperation for Multi-Goal Tasks Using Reinforcement Learning
    An, Tianxu
    Lee, Joonho
    Bjelonic, Marko
    De Vincenti, Flavio
    Hutter, Marco
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2025, 10 (02): 1585 - 1592