Learning to Modulate pre-trained Models in RL

Cited by: 0
Authors
Schmied, Thomas [1 ,2 ]
Hofmarcher, Markus [3 ]
Paischer, Fabian [1 ,2 ]
Pascanu, Razvan [4 ,5 ]
Hochreiter, Sepp [1 ,2 ]
Affiliations
[1] ELLIS Unit Linz, Linz, Austria
[2] Inst Machine Learning, LIT AI Lab, Linz, Austria
[3] Johannes Kepler Univ Linz, JKU LIT SAL eSPML Lab, Inst Machine Learning, Linz, Austria
[4] Google DeepMind, London, England
[5] UCL, London, England
Funding
EU Horizon 2020;
Keywords
NEURAL-NETWORKS; REINFORCEMENT;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Reinforcement Learning (RL) has been successful in various domains like robotics, game playing, and simulation. While RL agents have shown impressive capabilities in their specific tasks, they adapt poorly to new tasks. In supervised learning, this adaptation problem is addressed by large-scale pre-training followed by fine-tuning on new downstream tasks. Recently, pre-training on multiple tasks has been gaining traction in RL. However, fine-tuning a pre-trained model often suffers from catastrophic forgetting: performance on the pre-training tasks deteriorates when fine-tuning on new tasks. To investigate this phenomenon, we first jointly pre-train a model on datasets from two benchmark suites, namely Meta-World and DMControl. Then, we evaluate and compare a variety of fine-tuning methods prevalent in natural language processing, both in terms of performance on new tasks and in terms of how well performance on the pre-training tasks is retained. Our study shows that with most fine-tuning approaches, performance on the pre-training tasks deteriorates significantly. Therefore, we propose a novel method, Learning-to-Modulate (L2M), that avoids the degradation of learned skills by modulating the information flow of the frozen pre-trained model via a learnable modulation pool. Our method achieves state-of-the-art performance on the Continual-World benchmark, while retaining performance on the pre-training tasks. Finally, to aid future research in this area, we release a dataset encompassing 50 Meta-World and 16 DMControl tasks.
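The abstract describes L2M as keeping the pre-trained model frozen and modulating its information flow through a learnable modulation pool. Below is a minimal, hypothetical PyTorch-style sketch of that general idea only: a frozen backbone, a pool of learnable entries selected by key similarity, and a FiLM-style scale-and-shift applied to the backbone's hidden activations. The class names, shapes, selection rule, and the scale-and-shift form are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch (not the authors' code): a frozen pre-trained backbone whose
# hidden activations are modulated by a small pool of learnable parameters, selected
# per sample via key similarity. Only the pool is trained, so the frozen weights,
# and hence the pre-trained skills, are never overwritten.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModulationPool(nn.Module):
    """Pool of learnable (key, scale, shift) entries; a FiLM-style stand-in for the paper's modulation."""

    def __init__(self, pool_size: int, hidden_dim: int):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(pool_size, hidden_dim))
        self.scales = nn.Parameter(torch.ones(pool_size, hidden_dim))
        self.shifts = nn.Parameter(torch.zeros(pool_size, hidden_dim))

    def forward(self, query: torch.Tensor, hidden: torch.Tensor) -> torch.Tensor:
        # query:  (batch, hidden_dim) embedding used to pick a pool entry
        # hidden: (batch, hidden_dim) frozen-backbone activation to modulate
        sim = F.cosine_similarity(query.unsqueeze(1), self.keys.unsqueeze(0), dim=-1)
        # Hard, non-differentiable selection; training the keys would need a soft
        # match or an auxiliary loss, which is omitted here for brevity.
        idx = sim.argmax(dim=-1)
        return hidden * self.scales[idx] + self.shifts[idx]


# Usage: freeze the pre-trained model and train only the pool on a new task.
backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 64))
for p in backbone.parameters():
    p.requires_grad_(False)                      # pre-trained weights stay fixed

pool = ModulationPool(pool_size=10, hidden_dim=64)
obs = torch.randn(8, 32)                         # dummy batch of observations
hidden = backbone(obs)
modulated = pool(hidden.detach(), hidden)        # gradients flow only into the pool
optimizer = torch.optim.Adam(pool.parameters(), lr=1e-3)
```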
Pages: 35