Meta-Reinforcement Learning in Nonstationary and Nonparametric Environments

Cited by: 8
Authors
Bing, Zhenshan [1 ]
Knak, Lukas [1 ]
Cheng, Long [2 ]
Morin, Fabrice O. [1 ]
Huang, Kai [3 ]
Knoll, Alois [1 ]
Affiliations
[1] Tech Univ Munich, Dept Informat, D-85748 Munich, Germany
[2] Wenzhou Univ, Coll Comp Sci & Artificial Intelligence, Wenzhou 325035, Peoples R China
[3] Sun Yat-sen Univ, Sch Data & Comp Sci, Guangzhou 543000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Task analysis; Training; Adaptation models; Robots; Probabilistic logic; Turning; Switches; Gaussian variational autoencoder (VAE); meta-reinforcement learning (meta-RL); robotic control; task adaptation; task inference;
DOI
10.1109/TNNLS.2023.3270298
CLC classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent state-of-the-art artificial agents lack the ability to adapt rapidly to new tasks, as they are trained exclusively for specific objectives and require massive amounts of interaction to learn new skills. Meta-reinforcement learning (meta-RL) addresses this challenge by leveraging knowledge learned from training tasks to perform well in previously unseen tasks. However, current meta-RL approaches limit themselves to narrow parametric and stationary task distributions, ignoring qualitative differences and nonstationary changes between tasks that occur in the real world. In this article, we introduce a Task-Inference-based meta-RL algorithm using explicitly parameterized Gaussian variational autoencoders (VAEs) and gated Recurrent units (TIGR), designed for nonparametric and nonstationary environments. We employ a generative model involving a VAE to capture the multimodality of the tasks. We decouple the policy training from the task-inference learning and efficiently train the inference mechanism on the basis of an unsupervised reconstruction objective. We establish a zero-shot adaptation procedure to enable the agent to adapt to nonstationary task changes. We provide a benchmark with qualitatively distinct tasks based on the half-cheetah environment and demonstrate the superior performance of TIGR compared with state-of-the-art meta-RL approaches in terms of sample efficiency (three to ten times faster), asymptotic performance, and applicability in nonparametric and nonstationary environments with zero-shot adaptation. Videos can be viewed at https://videoviewsite.wixsite.com/tigr.
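The abstract describes a task-inference pipeline: a recurrent encoder summarizes transitions into a Gaussian belief over a latent task variable, a VAE-style reconstruction objective trains the inference network, and the sampled latent conditions the policy. The sketch below illustrates that general pattern in plain NumPy; it is not the authors' TIGR implementation, and all dimensions, weight initializations, and the toy linear decoder are illustrative assumptions (TIGR uses a mixture-of-Gaussians latent and trained networks).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUTaskEncoder:
    """Recurrent task-inference encoder: consumes a sequence of transition
    features (e.g. concatenated [s, a, r, s']) and outputs the mean and
    log-variance of a Gaussian belief over a latent task variable z."""

    def __init__(self, in_dim, hid_dim, z_dim, seed=0):
        rng = np.random.default_rng(seed)
        init = lambda *shape: rng.normal(0.0, 0.1, shape)
        # GRU gate weights: update (z), reset (r), candidate (h)
        self.Wz, self.Uz = init(in_dim, hid_dim), init(hid_dim, hid_dim)
        self.Wr, self.Ur = init(in_dim, hid_dim), init(hid_dim, hid_dim)
        self.Wh, self.Uh = init(in_dim, hid_dim), init(hid_dim, hid_dim)
        # Heads mapping the final hidden state to the Gaussian parameters
        self.W_mu, self.W_logvar = init(hid_dim, z_dim), init(hid_dim, z_dim)
        self.hid_dim = hid_dim

    def encode(self, transitions):
        h = np.zeros(self.hid_dim)
        for x in transitions:  # one GRU step per observed transition
            zg = sigmoid(x @ self.Wz + h @ self.Uz)
            rg = sigmoid(x @ self.Wr + h @ self.Ur)
            cand = np.tanh(x @ self.Wh + (rg * h) @ self.Uh)
            h = (1.0 - zg) * h + zg * cand
        return h @ self.W_mu, h @ self.W_logvar

def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps (the VAE reparameterization trick)."""
    return mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ); always nonnegative."""
    return 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar)

# --- Demo: infer a task latent from a short rollout and score an ELBO ---
rng = np.random.default_rng(1)
s_dim, a_dim, z_dim = 4, 2, 3
in_dim = s_dim + a_dim + 1 + s_dim                  # features: [s, a, r, s']
enc = GRUTaskEncoder(in_dim, hid_dim=16, z_dim=z_dim)

rollout = rng.standard_normal((10, in_dim))         # 10 fake transitions
mu, logvar = enc.encode(rollout)
z = reparameterize(mu, logvar, rng)                 # latent fed to the policy

# Toy linear decoder: predict s' from [s, a, z]; recon error + KL = -ELBO,
# the unsupervised reconstruction objective that trains the encoder.
W_dec = rng.normal(0.0, 0.1, (s_dim + a_dim + z_dim, s_dim))
s = rollout[-1, :s_dim]
a = rollout[-1, s_dim:s_dim + a_dim]
s_next = rollout[-1, -s_dim:]
recon = np.concatenate([s, a, z]) @ W_dec
loss = np.mean((recon - s_next) ** 2) + kl_to_standard_normal(mu, logvar)
```

Because the latent is re-inferred from the most recent transitions at every step, a nonstationary task switch shifts the belief (mu, logvar) without any gradient updates, which is the essence of the zero-shot adaptation the abstract claims.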
Pages: 13604 - 13618 (15 pages)
Related papers
50 items
  • [31] PAC-Bayesian offline Meta-reinforcement learning
    Zheng Sun
    Chenheng Jing
    Shangqi Guo
    Lingling An
    Applied Intelligence, 2023, 53 : 27128 - 27147
  • [32] Taming MAML: Efficient Unbiased Meta-Reinforcement Learning
    Liu, Hao
    Socher, Richard
    Xiong, Caiming
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [33] A Theoretical Understanding of Gradient Bias in Meta-Reinforcement Learning
    Liu, Bo
    Feng, Xidong
    Ren, Jie
    Mai, Luo
    Zhu, Rui
    Zhang, Haifeng
    Wang, Jun
    Yang, Yaodong
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [34] Context Shift Reduction for Offline Meta-Reinforcement Learning
    Gao, Yunkai
    Zhang, Rui
    Guo, Jiaming
    Wu, Fan
    Yi, Qi
    Peng, Shaohui
    Lan, Siming
    Chen, Ruizhi
    Du, Zidong
    Hu, Xing
    Guo, Qi
    Li, Ling
    Chen, Yunji
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [35] Global-Local Decomposition of Contextual Representations in Meta-Reinforcement Learning
    Ma, Nelson
    Xuan, Junyu
    Zhang, Guangquan
    Lu, Jie
    IEEE TRANSACTIONS ON CYBERNETICS, 2025, 55 (03) : 1277 - 1287
  • [36] Meta-Reinforcement Learning for Mastering Multiple Skills and Generalizing across Environments in Text-based Games
    Zhao, Zhenjie
    Sun, Mingfei
    Ma, Xiaojuan
    1ST WORKSHOP ON META LEARNING AND ITS APPLICATIONS TO NATURAL LANGUAGE PROCESSING (METANLP 2021), 2021, : 1 - 10
  • [37] A Federated Meta-Reinforcement Learning Algorithm Based on Gradient Correction
    Qin, Zerui
    Yue, Sheng
    PROCEEDINGS OF THE ACM TURING AWARD CELEBRATION CONFERENCE-CHINA 2024, ACM-TURC 2024, 2024, : 220 - 221
  • [38] Harnessing Meta-Reinforcement Learning for Enhanced Tracking in Geofencing Systems
    Famili, Alireza
    Sun, Shihua
    Atalay, Tolga
    Stavrou, Angelos
    IEEE OPEN JOURNAL OF THE COMMUNICATIONS SOCIETY, 2025, 6 : 944 - 960
  • [39] On First-Order Meta-Reinforcement Learning with Moreau Envelopes
    Toghani, Mohammad Taha
    Perez-Salazar, Sebastian
    Uribe, Cesar A.
    2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL, CDC, 2023, : 4176 - 4181
  • [40] Meta-Reinforcement Learning by Tracking Task Non-stationarity
    Poiani, Riccardo
    Tirinzoni, Andrea
    Restelli, Marcello
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 2899 - 2905