Pretraining Representations for Data-Efficient Reinforcement Learning

Cited by: 0

Authors:

Schwarzer, Max [1,2]

Rajkumar, Nitarshan [1,2]

Noukhovitch, Michael [1,2]

Anand, Ankesh [1,2]

Charlin, Laurent [1,3,5]

Hjelm, Devon [1,4]

Bachman, Philip [4]

Courville, Aaron [1,2,5]

Affiliations:

[1] Mila, Montreal, PQ, Canada

[2] Univ Montreal, Montreal, PQ, Canada

[3] HEC Montreal, Montreal, PQ, Canada

[4] Microsoft Res, Montreal, PQ, Canada

[5] CIFAR, Toronto, ON, Canada

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021) | 2021 / Volume 34

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Data efficiency is a key challenge for deep reinforcement learning. We address this problem by using unlabeled data to pretrain an encoder which is then finetuned on a small amount of task-specific data. To encourage learning representations which capture diverse aspects of the underlying MDP, we employ a combination of latent dynamics modelling and unsupervised goal-conditioned RL. When limited to 100k steps of interaction on Atari games (equivalent to two hours of human experience), our approach significantly surpasses prior work combining offline representation pretraining with task-specific finetuning, and compares favourably with other pretraining methods that require orders of magnitude more data. Our approach shows particular promise when combined with larger models as well as more diverse, task-aligned observational data - approaching human-level performance and data-efficiency on Atari in our best setting. We provide code associated with this work at https://github.com/mila-iqia/SGI.


Pages: 14

50 records in total

[1] EqR: Equivariant Representations for Data-Efficient Reinforcement Learning
Mondal, Arnab Kumar
Jain, Vineet
Siddiqi, Kaleem
Ravanbakhsh, Siamak
[J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
[2] Data-Efficient Hierarchical Reinforcement Learning
Nachum, Ofir
Gu, Shixiang
Lee, Honglak
Levine, Sergey
[J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
[3] Mix-up Consistent Cross Representations for Data-Efficient Reinforcement Learning
Liu, Shiyu
Cao, Guitao
Liu, Yong
Li, Yan
Wu, Chunwei
Xi, Xidong
[J]. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
[4] Data-Efficient Reinforcement Learning for Malaria Control
Zou, Lixin
[J]. PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 507 - 513
[5] Data-Efficient Reinforcement Learning for Variable Impedance Control
Anand, Akhil S.
Kaushik, Rituraj
Gravdahl, Jan Tommy
Abu-Dakka, Fares J.
[J]. IEEE ACCESS, 2024, 12 : 15631 - 15641
[6] BarlowRL: Barlow Twins for Data-Efficient Reinforcement Learning
Cagatan, Omer Veysel
Akgun, Baris
[J]. ASIAN CONFERENCE ON MACHINE LEARNING, VOL 222, 2023, 222
[7] Data-Efficient Pipeline for Offline Reinforcement Learning with Limited Data
Nie, Allen
Flet-Berliac, Yannis
Jordan, Deon R.
Steenbergen, William
Brunskill, Emma
[J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
[8] Data-Efficient Offline Reinforcement Learning with Approximate Symmetries
Angelotti, Giorgio
Drougard, Nicolas
Chanel, Caroline P. C.
[J]. AGENTS AND ARTIFICIAL INTELLIGENCE, ICAART 2023, 2024, 14546 : 164 - 186
[9] Optimistic Sampling Strategy for Data-Efficient Reinforcement Learning
Zhao, Dongfang
Liu, Jiafeng
Wu, Rui
Cheng, Dansong
Tang, Xianglong
[J]. IEEE ACCESS, 2019, 7 : 55763 - 55769
[10] Concurrent Credit Assignment for Data-efficient Reinforcement Learning
Dauce, Emmanuel
[J]. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
