Value-Consistent Representation Learning for Data-Efficient Reinforcement Learning

Cited: 0
Authors
Yue, Yang [1 ,2 ]
Kang, Bingyi [2 ]
Xu, Zhongwen [2 ]
Huang, Gao [1 ]
Yan, Shuicheng [2 ]
Affiliations
[1] Tsinghua Univ, Dept Automat, BNRist, Beijing, Peoples R China
[2] Sea AI Lab, Singapore, Singapore
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Deep reinforcement learning (RL) algorithms suffer severe performance degradation when interaction data are scarce, which limits their real-world applicability. Recently, visual representation learning has been shown to be effective and promising for boosting sample efficiency in RL. These methods usually rely on contrastive learning and data augmentation to train a transition model for state prediction, which differs from how the model is used in RL, namely for value-based planning. Accordingly, the representations learned by these methods may be good for recognition but not optimal for estimating state values and solving the decision problem. To address this issue, we propose a novel method, called value-consistent representation learning (VCR), to learn representations that are directly relevant to decision-making. More specifically, VCR trains a model to predict the future state (also referred to as the "imagined state") from the current state and a sequence of actions. Instead of aligning this imagined state with the real state returned by the environment, VCR applies a Q-value head to both states, yielding two action-value distributions. A distance between the two distributions is then computed and minimized, forcing the imagined state to produce action-value predictions similar to those of the real state. We develop two implementations of this idea, for discrete and continuous action spaces respectively. Experiments on the Atari 100K and DeepMind Control Suite benchmarks validate their effectiveness in improving sample efficiency. Our methods achieve new state-of-the-art performance among search-free RL algorithms.
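To make the mechanism described above concrete, the following is a minimal PyTorch-style sketch of a value-consistency loss for the discrete-action case. The module names (encoder, transition_model, q_head) and the choice of KL divergence between softmax-normalized Q-vectors as the distance measure are illustrative assumptions, not the paper's exact implementation; the continuous-action variant mentioned in the abstract would require a different treatment of the Q-head output.

```python
# Minimal sketch of a value-consistency loss for discrete actions (PyTorch).
# The components below (encoder, transition_model, q_head) are assumed
# placeholders; the paper's architecture and distance measure may differ.
import torch
import torch.nn.functional as F


def value_consistency_loss(encoder, transition_model, q_head,
                           obs, actions, next_obs):
    """Align action-value predictions of imagined and real next states.

    obs:      current observations,                    shape (B, ...)
    actions:  actions taken at the current step,       shape (B,)
    next_obs: next observations returned by the env,   shape (B, ...)
    """
    # Encode the current observation and roll the latent forward one step
    # with the learned transition model to get the "imagined" state.
    z = encoder(obs)                              # (B, D)
    z_imagined = transition_model(z, actions)     # (B, D)

    # Encode the real next state and compute its action values as a
    # fixed target (gradients stopped, a common self-prediction choice).
    with torch.no_grad():
        z_real = encoder(next_obs)                # (B, D)
        q_real = q_head(z_real)                   # (B, num_actions)

    q_imagined = q_head(z_imagined)               # (B, num_actions)

    # Treat both Q-vectors as distributions over actions and minimize
    # their divergence -- one possible instantiation of the "distance".
    log_p = F.log_softmax(q_imagined, dim=-1)
    target = F.softmax(q_real, dim=-1)
    return F.kl_div(log_p, target, reduction="batchmean")
```

In a full agent this term would be added to the usual RL objective, so that the encoder and transition model are shaped by value consistency rather than by pixel- or feature-level state reconstruction alone.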
Pages: 11069-11077
Number of pages: 9