SCORE: Simple Contrastive Representation and Reset-Ensemble for offline meta-reinforcement learning

Times Cited: 0
Authors
Yang, Hanjie [1 ]
Lin, Kai [1 ]
Yang, Tao [1 ]
Sun, Guohan [1 ]
Affiliations
[1] Dalian Univ Technol, Dept Comp Sci & Technol, Dalian, Peoples R China
Keywords
Offline meta-reinforcement learning; Contrastive learning; Reset-Ensemble
DOI
10.1016/j.knosys.2024.112767
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Offline meta-reinforcement learning (OMRL) aims to train agents that quickly adapt to new tasks using only pre-collected data. However, existing OMRL methods often involve numerous ineffective training iterations and may suffer performance collapse in the later stages of training. We identify the root cause as the shallow memorization problem, in which agents overspecialize in specific solutions for encountered states, hindering their generalization. This issue arises from the loss of plasticity and the premature fitting of neural networks, which restrict the agents' exploration. To address this challenge, we propose Simple COntrastive Representation and Reset-Ensemble for OMRL (SCORE), a novel context-based OMRL approach. SCORE introduces an end-to-end contrastive learning framework without negative samples to pre-train a context encoder, enabling more robust task representations; the context encoder is then fine-tuned during meta-training. Furthermore, SCORE employs a Reset-Ensemble mechanism that periodically resets and ensembles parts of the networks to maintain the agents' continual learning ability and to sharpen their perception of characteristics across diverse tasks. Extensive experiments demonstrate that SCORE effectively avoids premature fitting and exhibits strong generalization performance.
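The abstract describes two mechanisms at a high level. The minimal Python sketch below (not the authors' implementation) illustrates how such mechanisms are commonly realized: a negative-sample-free contrastive loss in the style of BYOL/SimSiam for pre-training a context encoder on transition contexts, and a periodic reset step that snapshots the current network into an ensemble before reinitializing part of it to restore plasticity. All class and function names, network sizes, and the choice of resetting only the final layer are illustrative assumptions, not details taken from the paper.

    # Minimal sketch of (1) negative-free contrastive pre-training of a context
    # encoder and (2) a periodic reset-and-ensemble step. Shapes, names, and
    # hyperparameters are assumptions for illustration only.
    import copy
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ContextEncoder(nn.Module):
        """Maps a batch of transitions (s, a, r, s') to a task representation."""
        def __init__(self, transition_dim: int, latent_dim: int = 64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(transition_dim, 256), nn.ReLU(),
                nn.Linear(256, latent_dim),
            )

        def forward(self, transitions: torch.Tensor) -> torch.Tensor:
            # transitions: (batch, context_len, transition_dim); mean-pool over context
            return self.net(transitions).mean(dim=1)

    def negative_free_contrastive_loss(encoder, predictor, ctx_a, ctx_b):
        """SimSiam-style loss on two views of the same task's context:
        no negative pairs; the stop-gradient prevents representation collapse."""
        z_a, z_b = encoder(ctx_a), encoder(ctx_b)
        p_a, p_b = predictor(z_a), predictor(z_b)
        loss_a = -F.cosine_similarity(p_a, z_b.detach(), dim=-1).mean()
        loss_b = -F.cosine_similarity(p_b, z_a.detach(), dim=-1).mean()
        return 0.5 * (loss_a + loss_b)

    def reset_and_ensemble(online_net: nn.Module, frozen_copies: list, period: int, step: int):
        """Illustrative reset-ensemble step: every `period` updates, snapshot the
        online network into the ensemble, then reinitialize its final linear
        layer to restore plasticity. Ensemble outputs are averaged at evaluation."""
        if step > 0 and step % period == 0:
            frozen_copies.append(copy.deepcopy(online_net).eval())
            last_linear = [m for m in online_net.modules() if isinstance(m, nn.Linear)][-1]
            last_linear.reset_parameters()
        return frozen_copies

    # Example usage (assumed shapes): two views of the same task's context are
    # obtained by sampling two mini-batches of transitions from that task.
    encoder = ContextEncoder(transition_dim=20)
    predictor = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))
    ctx_a = torch.randn(8, 32, 20)   # (tasks_in_batch, context_len, transition_dim)
    ctx_b = torch.randn(8, 32, 20)
    loss = negative_free_contrastive_loss(encoder, predictor, ctx_a, ctx_b)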
Pages: 10