Sample-efficient Reinforcement Learning Representation Learning with Curiosity Contrastive Forward Dynamics Model

Cited by: 9
Authors
Nguyen, Thanh [1 ]
Luu, Tung M. [1 ]
Vu, Thang [1 ]
Yoo, Chang D. [1 ]
Affiliations
[1] Korea Adv Inst Sci & Technol, Fac Elect Engn, Daejeon 34141, South Korea
Keywords
LEVEL; GO;
DOI
10.1109/IROS51168.2021.9636536
CLC Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Developing a reinforcement learning (RL) agent capable of performing complex control tasks directly from high-dimensional observations such as raw pixels remains a challenge, as the sample efficiency and generalization of RL algorithms still need improvement. This paper considers a learning framework, the Curiosity Contrastive Forward Dynamics Model (CCFDM), for achieving more sample-efficient RL directly from raw pixels. CCFDM incorporates a forward dynamics model (FDM) and performs contrastive learning to train its deep convolutional neural network-based image encoder (IE) to extract spatial and temporal information conducive to sample-efficient RL. In addition, during training, CCFDM provides intrinsic rewards based on the FDM prediction error, encouraging the curiosity of the RL agent and thereby improving exploration. The diverse and less repetitive observations provided by this exploration strategy, together with the data augmentation used in contrastive learning, improve not only sample efficiency but also generalization. Existing model-free RL methods such as Soft Actor-Critic, when built on top of CCFDM, outperform prior state-of-the-art pixel-based RL methods on the DeepMind Control Suite benchmark.
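The abstract describes two coupled mechanisms: a contrastive objective that aligns the FDM's predicted next latent with the encoded next observation, and an intrinsic reward derived from the FDM's prediction error. The sketch below illustrates how these two pieces can fit together in PyTorch; the module names, network sizes, plain dot-product similarity, and separate target encoder are assumptions for illustration, not the authors' exact implementation.

```python
# Minimal sketch of the two CCFDM mechanisms named in the abstract,
# assuming a PyTorch setup. All names and sizes here are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Convolutional image encoder (IE): raw pixels -> latent vector."""
    def __init__(self, latent_dim=50):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2), nn.ReLU(),
            nn.Conv2d(32, 32, 3, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.fc = nn.LazyLinear(latent_dim)  # infers input size on first call

    def forward(self, obs):
        return self.fc(self.conv(obs))

class ForwardModel(nn.Module):
    """Forward dynamics model (FDM): (latent, action) -> predicted next latent."""
    def __init__(self, latent_dim=50, action_dim=6):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )

    def forward(self, z, a):
        return self.net(torch.cat([z, a], dim=-1))

def ccfdm_step(encoder, target_encoder, fdm, obs, action, next_obs):
    """One update's worth of contrastive loss and curiosity reward."""
    z_pred = fdm(encoder(obs), action)        # predicted next latent (query)
    with torch.no_grad():
        z_next = target_encoder(next_obs)     # encoded next observation (key)

    # InfoNCE-style contrastive loss: each predicted latent should match
    # the latent of its own next observation against the other batch samples.
    logits = z_pred @ z_next.t()              # (B, B) similarity matrix
    labels = torch.arange(logits.size(0), device=logits.device)
    contrastive_loss = F.cross_entropy(logits, labels)

    # Curiosity: detached FDM prediction error, usable as an intrinsic
    # reward added to the environment reward during training.
    intrinsic_reward = (z_pred - z_next).pow(2).mean(dim=-1).detach()
    return contrastive_loss, intrinsic_reward
```

In a full training loop, the intrinsic reward would be scaled and added to the environment reward before the Soft Actor-Critic update, and the target encoder would typically be a slowly updated (e.g., exponential-moving-average) copy of the online encoder, a common choice in contrastive RL.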
Pages: 3471-3477
Number of pages: 7
Related Papers
50 records in total
  • [1] Sample-Efficient Reinforcement Learning of Undercomplete POMDPs
    Jin, Chi
    Kakade, Sham M.
    Krishnamurthy, Akshay
    Liu, Qinghua
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [2] Sample-efficient Real-time Planning with Curiosity Cross-Entropy Method and Contrastive Learning
    Kotb, Mostafa
    Weber, Cornelius
    Wermter, Stefan
    2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2023, pp. 9456-9463
  • [3] Sample-efficient model-based reinforcement learning for quantum control
    Khalid, Irtaza
    Weidner, Carrie A.
    Jonckheere, Edmond A.
    Schirmer, Sophie G.
    Langbein, Frank C.
    PHYSICAL REVIEW RESEARCH, 2023, 5 (04)
  • [4] Sample-Efficient Preference-based Reinforcement Learning with Dynamics Aware Rewards
    Metcalf, Katherine
    Sarabia, Miguel
    Mackraz, Natalie
    Theobald, Barry-John
    CONFERENCE ON ROBOT LEARNING, VOL 229, 2023, 229
  • [5] Sample-Efficient Multimodal Dynamics Modeling for Risk-Sensitive Reinforcement Learning
    Yashima, Ryota
    Yamaguchi, Akihiko
    Hashimoto, Koichi
    2022 8TH INTERNATIONAL CONFERENCE ON MECHATRONICS AND ROBOTICS ENGINEERING (ICMRE 2022), 2022, pp. 21-27
  • [6] Sample-efficient Reinforcement Learning in Robotic Table Tennis
    Tebbe, Jonas
    Krauch, Lukas
    Gao, Yapeng
    Zell, Andreas
    2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, pp. 4171-4178
  • [7] Sample-efficient reinforcement learning for CERN accelerator control
    Kain, Verena
    Hirlander, Simon
    Goddard, Brennan
    Velotti, Francesco Maria
    Porta, Giovanni Zevi Della
    Bruchon, Niky
    Valentino, Gianluca
    PHYSICAL REVIEW ACCELERATORS AND BEAMS, 2020, 23 (12)
  • [8] A New Sample-Efficient PAC Reinforcement Learning Algorithm
    Zehfroosh, Ashkan
    Tanner, Herbert G.
    2020 28TH MEDITERRANEAN CONFERENCE ON CONTROL AND AUTOMATION (MED), 2020, pp. 788-793
  • [9] Conditional Abstraction Trees for Sample-Efficient Reinforcement Learning
    Dadvar, Mehdi
    Nayyar, Rashmeet Kaur
    Srivastava, Siddharth
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216, pp. 485-495