Efficient Multi-Goal Reinforcement Learning via Value Consistency Prioritization

Cited by: 0

Authors:

Xu J. [1]

Li S. [1]

Yang R. [2]

Yuan C. [1]

Han L. [3]

Affiliations:

[1] Tsinghua Shenzhen International Graduate School, Guangdong, Shenzhen

[2] The Hong Kong University of Science and Technology, Hong Kong

[3] Tencent Robotics X, Shenzhen, Guangdong

Source:

Journal of Artificial Intelligence Research | 2023 / Vol. 77

Keywords:

All Open Access; Gold;

DOI:

10.1613/jair.1.14398

Abstract:

Goal-conditioned reinforcement learning (RL) with sparse rewards remains a challenging problem in deep RL. Hindsight Experience Replay (HER) has been demonstrated to be an effective solution: it replaces the desired goals in failed experiences with states that were actually achieved. Existing approaches mainly focus on either exploration or exploitation to improve the performance of HER. From a joint perspective, exploiting specific past experiences can also implicitly drive exploration. Therefore, we concentrate on prioritizing both original and relabeled samples for efficient goal-conditioned RL. To achieve this, we propose a novel value consistency prioritization (VCP) method, where the priority of samples is determined by the consistency of ensemble Q-values. This distinguishes VCP from most existing prioritization approaches, which prioritize samples based on the uncertainty of ensemble Q-values. Through extensive experiments, we demonstrate that VCP achieves significantly higher sample efficiency than existing algorithms on a range of challenging goal-conditioned manipulation tasks. We also visualize how VCP prioritizes good experiences to enhance policy learning. © 2023 AI Access Foundation. All rights reserved.
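The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the VCP formula, so the sketch assumes that "consistency" means low standard deviation across ensemble Q-values and turns it into sampling probabilities with a softmax. The function names, the `tol` success threshold, and the `temperature` parameter are all hypothetical. The relabeling function follows the commonly described HER "final" strategy, in which the last achieved state of an episode is substituted for the desired goal.

```python
import numpy as np


def sparse_reward(achieved, goal, tol=0.05):
    # Usual sparse goal-reaching reward: 0 on success, -1 otherwise.
    # `tol` is a hypothetical success threshold.
    return 0.0 if np.linalg.norm(achieved - goal) <= tol else -1.0


def relabel_final(episode):
    """HER-style relabeling with the 'final' strategy.

    Each transition is a dict with keys 'achieved' and 'goal'.
    The last achieved state becomes the goal for every transition,
    and the sparse reward is recomputed against that new goal.
    """
    new_goal = episode[-1]["achieved"]
    return [
        {**t, "goal": new_goal, "reward": sparse_reward(t["achieved"], new_goal)}
        for t in episode
    ]


def consistency_priorities(q_values, temperature=1.0):
    """Sampling probabilities from ensemble agreement.

    q_values has shape (n_ensemble, n_samples).
    ASSUMPTION: consistency is measured as the negative standard
    deviation across ensemble members; the paper's exact definition
    is not stated in the abstract.
    """
    disagreement = q_values.std(axis=0)
    logits = -disagreement / temperature
    logits = logits - logits.max()  # numerical stability for the softmax
    probs = np.exp(logits)
    return probs / probs.sum()
```

Under this assumption, samples on which the ensemble members agree receive higher sampling probability, which is the opposite of uncertainty-based prioritization, where disagreement raises priority.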

Pages: 355-376

Page count: 21
