Generalizable Task Representation Learning for Offline Meta-Reinforcement Learning with Data Limitations

Cited by: 0
Authors:
Zhou, Renzhe
Gao, Chen-Xiao
Zhang, Zongzhang [1]
Yu, Yang
Affiliation:
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Peoples R China
Funding:
National Key Research and Development Program of China; U.S. National Science Foundation;
Keywords:
CONTEXT;
DOI:
Not available
Chinese Library Classification:
TP18 [Artificial Intelligence Theory];
Subject Classification Codes:
081104; 0812; 0835; 1405;
Abstract:
Generalization and sample efficiency have been longstanding issues in reinforcement learning, and the field of Offline Meta-Reinforcement Learning (OMRL) has therefore gained increasing attention for its potential to solve a wide range of problems with static and limited offline data. Existing OMRL methods often assume sufficient training tasks and data coverage to apply contrastive learning for extracting task representations. However, such assumptions do not hold in many real-world applications, which undermines the generalization ability of the learned representations. In this paper, we consider OMRL under two types of data limitations, limited training tasks and limited behavior diversity, and propose a novel algorithm called GENTLE for learning generalizable task representations in the face of these limitations. GENTLE employs a Task Auto-Encoder (TAE), an encoder-decoder architecture that extracts the characteristics of tasks. Unlike existing methods, the TAE is optimized solely by reconstructing state transitions and rewards, which captures the generative structure of the task models and produces generalizable representations when training tasks are limited. To alleviate the effect of limited behavior diversity, we construct pseudo-transitions to align the data distribution used to train the TAE with the data distribution encountered during testing. Empirically, GENTLE significantly outperforms existing OMRL methods on both in-distribution and out-of-distribution tasks, under both the given-context protocol and the one-shot protocol.
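The abstract compresses the method into a few sentences, so a compact sketch may help make the two mechanisms concrete. Below is a minimal PyTorch rendering built only from the abstract's description: the mean-pooled context encoder, the exact loss form, the network sizes, and the `pseudo_transitions` relabeling helper are all illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of the two ideas in the abstract: a Task Auto-Encoder (TAE)
# trained purely by reconstructing next states and rewards, and pseudo-
# transitions that relabel broadly sampled (s, a) pairs with the decoder's
# predictions. All names and hyperparameters here are hypothetical.
import torch
import torch.nn as nn

class TaskAutoEncoder(nn.Module):
    def __init__(self, state_dim: int, action_dim: int,
                 z_dim: int = 8, hidden: int = 128):
        super().__init__()
        # Encoder: embeds each context transition (s, a, r, s'), then
        # mean-pools over the context set into one task representation z.
        self.encoder = nn.Sequential(
            nn.Linear(2 * state_dim + action_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, z_dim),
        )
        # Decoder: predicts (s', r) from (s, a, z), i.e., the task model.
        self.decoder = nn.Sequential(
            nn.Linear(state_dim + action_dim + z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim + 1),
        )

    def encode(self, s, a, r, s_next):
        feats = self.encoder(torch.cat([s, a, r, s_next], dim=-1))
        return feats.mean(dim=0, keepdim=True)  # permutation-invariant pooling

    def reconstruction_loss(self, z, s, a, r, s_next):
        # The TAE's only training signal: reconstruct the state transition
        # and reward, so z must capture the generative structure of the task.
        z_rep = z.expand(s.shape[0], -1)
        pred = self.decoder(torch.cat([s, a, z_rep], dim=-1))
        pred_s_next, pred_r = pred[:, :-1], pred[:, -1:]
        return ((pred_s_next - s_next) ** 2).mean() + ((pred_r - r) ** 2).mean()

@torch.no_grad()
def pseudo_transitions(tae, z, s_pool, a_pool):
    """Relabel (s, a) pairs drawn from a broader pool (e.g., other tasks'
    datasets) with the decoder's predicted reward and next state. This
    relabeling scheme is one plausible reading of the abstract's
    distribution-alignment step, not a confirmed detail of GENTLE."""
    z_rep = z.expand(s_pool.shape[0], -1)
    pred = tae.decoder(torch.cat([s_pool, a_pool, z_rep], dim=-1))
    return s_pool, a_pool, pred[:, -1:], pred[:, :-1]  # (s, a, r, s')
```

Under these assumptions, training would alternate between encoding a task's offline context into z, minimizing `reconstruction_loss` on real transitions, and mixing relabeled pseudo-transitions into the encoder's input so that its training distribution better matches the broader behavior distribution it faces at test time.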
Pages: 17132 - 17140
Number of pages: 9
Related Papers (50 in total; items [21]-[30] shown):
  • [21] Adaptable Image Quality Assessment Using Meta-Reinforcement Learning of Task Amenability
    Saeed, Shaheer U.
    Fu, Yunguan
    Stavrinides, Vasilis
    Baum, Zachary M. C.
    Yang, Qianye
    Rusu, Mirabela
    Fan, Richard E.
    Sonn, Geoffrey A.
    Noble, J. Alison
    Barratt, Dean C.
    Hu, Yipeng
    SIMPLIFYING MEDICAL ULTRASOUND, 2021, 12967 : 191 - 201
  • [22] Dynamic Task Offloading Scheme for Edge Computing via Meta-Reinforcement Learning
    Liu, Jiajia
    Xie, Peng
    Li, Wei
    Tang, Bo
    Liu, Jianhua
    CMC-COMPUTERS MATERIALS & CONTINUA, 2025, 82 (02) : 2609 - 2635
  • [23] A Review of Offline Reinforcement Learning Based on Representation Learning
    Wang, X.-S.
    Wang, R.-R.
    Cheng, Y.-H.
    ZIDONGHUA XUEBAO/ACTA AUTOMATICA SINICA, 2024, 50 (06) : 1104 - 1128
  • [24] Some Considerations on Learning to Explore via Meta-Reinforcement Learning
    Stadie, Bradly C.
    Yang, Ge
    Houthooft, Rein
    Chen, Xi
    Duan, Yan
    Wu, Yuhuai
    Abbeel, Pieter
    Sutskever, Ilya
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [25] Prefrontal cortex as a meta-reinforcement learning system
    Wang, Jane X.
    Kurth-Nelson, Zeb
    Kumaran, Dharshan
    Tirumala, Dhruva
    Soyer, Hubert
    Leibo, Joel Z.
    Hassabis, Demis
    Botvinick, Matthew
    NATURE NEUROSCIENCE, 2018, 21 : 860 - 868
  • [26] A Meta-Reinforcement Learning Approach to Process Control
    McClement, Daniel G.
    Lawrence, Nathan P.
    Loewen, Philip D.
    Forbes, Michael G.
    Backstrom, Johan U.
    Gopaluni, R. Bhushan
    IFAC PAPERSONLINE, 2021, 54 (03) : 685 - 692
  • [27] Unsupervised Curricula for Visual Meta-Reinforcement Learning
    Jabri, Allan
    Hsu, Kyle
    Eysenbach, Benjamin
    Gupta, Abhishek
    Levine, Sergey
    Finn, Chelsea
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [28] Meta-Reinforcement Learning of Structured Exploration Strategies
    Gupta, Abhishek
    Mendonca, Russell
    Liu, YuXuan
    Abbeel, Pieter
    Levine, Sergey
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [29] Meta-Reinforcement Learning With Dynamic Adaptiveness Distillation
    Hu, Hangkai
    Huang, Gao
    Li, Xiang
    Song, Shiji
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (03) : 1454 - 1464
  • [30] A Meta-Reinforcement Learning Algorithm for Causal Discovery
    Sauter, Andreas
    Acar, Erman
    Francois-Lavet, Vincent
    CONFERENCE ON CAUSAL LEARNING AND REASONING, VOL 213, 2023, 213 : 602 - 619