Pretraining Representations for Data-Efficient Reinforcement Learning

Cited by: 0
Authors
Schwarzer, Max [1, 2]
Rajkumar, Nitarshan [1, 2]
Noukhovitch, Michael [1, 2]
Anand, Ankesh [1, 2]
Charlin, Laurent [1, 3, 5]
Hjelm, Devon [1, 4]
Bachman, Philip [4]
Courville, Aaron [1, 2, 5]
Affiliations
[1] Mila, Montreal, PQ, Canada
[2] Univ Montreal, Montreal, PQ, Canada
[3] HEC Montreal, Montreal, PQ, Canada
[4] Microsoft Res, Montreal, PQ, Canada
[5] CIFAR, Toronto, ON, Canada
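Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract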
Data efficiency is a key challenge for deep reinforcement learning. We address this problem by using unlabeled data to pretrain an encoder which is then finetuned on a small amount of task-specific data. To encourage learning representations which capture diverse aspects of the underlying MDP, we employ a combination of latent dynamics modelling and unsupervised goal-conditioned RL. When limited to 100k steps of interaction on Atari games (equivalent to two hours of human experience), our approach significantly surpasses prior work combining offline representation pretraining with task-specific finetuning, and compares favourably with other pretraining methods that require orders of magnitude more data. Our approach shows particular promise when combined with larger models as well as more diverse, task-aligned observational data - approaching human-level performance and data-efficiency on Atari in our best setting. We provide code associated with this work at https://github.com/mila-iqia/SGI.
Pages: 14
Related papers
50 in total
  • [1] EqR: Equivariant Representations for Data-Efficient Reinforcement Learning
    Mondal, Arnab Kumar
    Jain, Vineet
    Siddiqi, Kaleem
    Ravanbakhsh, Siamak
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [2] Data-Efficient Hierarchical Reinforcement Learning
    Nachum, Ofir
    Gu, Shixiang
    Lee, Honglak
    Levine, Sergey
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [3] Mix-up Consistent Cross Representations for Data-Efficient Reinforcement Learning
    Liu, Shiyu
    Cao, Guitao
    Liu, Yong
    Li, Yan
    Wu, Chunwei
    Xi, Xidong
    [J]. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
  • [4] Data-Efficient Reinforcement Learning for Malaria Control
    Zou, Lixin
    [J]. PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 507 - 513
  • [5] Data-Efficient Reinforcement Learning for Variable Impedance Control
    Anand, Akhil S.
    Kaushik, Rituraj
    Gravdahl, Jan Tommy
    Abu-Dakka, Fares J.
    [J]. IEEE ACCESS, 2024, 12 : 15631 - 15641
  • [6] BarlowRL: Barlow Twins for Data-Efficient Reinforcement Learning
    Cagatan, Omer Veysel
    Akgun, Baris
    [J]. ASIAN CONFERENCE ON MACHINE LEARNING, VOL 222, 2023, 222
  • [7] Data-Efficient Pipeline for Offline Reinforcement Learning with Limited Data
    Nie, Allen
    Flet-Berliac, Yannis
    Jordan, Deon R.
    Steenbergen, William
    Brunskill, Emma
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [8] Data-Efficient Offline Reinforcement Learning with Approximate Symmetries
    Angelotti, Giorgio
    Drougard, Nicolas
    Chanel, Caroline P. C.
    [J]. AGENTS AND ARTIFICIAL INTELLIGENCE, ICAART 2023, 2024, 14546 : 164 - 186
  • [9] Optimistic Sampling Strategy for Data-Efficient Reinforcement Learning
    Zhao, Dongfang
    Liu, Jiafeng
    Wu, Rui
    Cheng, Dansong
    Tang, Xianglong
    [J]. IEEE ACCESS, 2019, 7 : 55763 - 55769
  • [10] Concurrent Credit Assignment for Data-efficient Reinforcement Learning
    Dauce, Emmanuel
    [J]. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022