Meta-Reinforcement Learning in Nonstationary and Nonparametric Environments

Cited by: 8
Authors
Bing, Zhenshan [1]
Knak, Lukas [1]
Cheng, Long [2]
Morin, Fabrice O. [1]
Huang, Kai [3]
Knoll, Alois [1]
Affiliations
[1] Tech Univ Munich, Dept Informat, D-85748 Munich, Germany
[2] Wenzhou Univ, Coll Comp Sci & Artificial Intelligence, Wenzhou 325035, Peoples R China
[3] Sun Yat-sen Univ, Sch Data & Comp Sci, Guangzhou 543000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Task analysis; Training; Adaptation models; Robots; Probabilistic logic; Turning; Switches; Gaussian variational autoencoder (VAE); meta-reinforcement learning (meta-RL); robotic control; task adaptation; task inference;
DOI
10.1109/TNNLS.2023.3270298
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent state-of-the-art artificial agents lack the ability to adapt rapidly to new tasks, as they are trained exclusively for specific objectives and require massive amounts of interaction to learn new skills. Meta-reinforcement learning (meta-RL) addresses this challenge by leveraging knowledge learned from training tasks to perform well in previously unseen tasks. However, current meta-RL approaches limit themselves to narrow parametric and stationary task distributions, ignoring qualitative differences and nonstationary changes between tasks that occur in the real world. In this article, we introduce a Task-Inference-based meta-RL algorithm using explicitly parameterized Gaussian variational autoencoders (VAEs) and gated Recurrent units (TIGR), designed for nonparametric and nonstationary environments. We employ a generative model involving a VAE to capture the multimodality of the tasks. We decouple the policy training from the task-inference learning and efficiently train the inference mechanism on the basis of an unsupervised reconstruction objective. We establish a zero-shot adaptation procedure to enable the agent to adapt to nonstationary task changes. We provide a benchmark with qualitatively distinct tasks based on the half-cheetah environment and demonstrate the superior performance of TIGR compared with state-of-the-art meta-RL approaches in terms of sample efficiency (three to ten times faster), asymptotic performance, and applicability in nonparametric and nonstationary environments with zero-shot adaptation. Videos can be viewed at https://videoviewsite.wixsite.com/tigr.
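The inference mechanism the abstract describes, a Gaussian VAE whose latent sample serves as a task belief and which is trained on an unsupervised reconstruction objective, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the linear encoder/decoder, the dimensions, and all variable names are toy assumptions chosen only to show the reparameterization trick and the two ELBO terms.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(transition, W_mu, W_logvar):
    """Toy linear encoder: map a transition to a Gaussian task belief."""
    return transition @ W_mu, transition @ W_logvar

def reparameterize(mu, logvar, rng):
    """Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def elbo_terms(transition, z, W_dec, mu, logvar):
    """Unsupervised reconstruction objective: MSE plus KL to N(0, I)."""
    recon = z @ W_dec
    recon_loss = np.mean((recon - transition) ** 2)
    kl = -0.5 * np.mean(1.0 + logvar - mu**2 - np.exp(logvar))
    return recon_loss, kl

# Toy dimensions: transition features -> low-dimensional task variable.
d_in, d_z = 6, 2
W_mu = rng.standard_normal((d_in, d_z)) * 0.1
W_logvar = rng.standard_normal((d_in, d_z)) * 0.1
W_dec = rng.standard_normal((d_z, d_in)) * 0.1

transition = rng.standard_normal((1, d_in))
mu, logvar = encode(transition, W_mu, W_logvar)
z = reparameterize(mu, logvar, rng)
recon_loss, kl = elbo_terms(transition, z, W_dec, mu, logvar)
print(f"recon={recon_loss:.3f}, kl={kl:.3f}")
```

Because the policy is conditioned only on the latent sample `z`, this objective can be optimized separately from policy training, which is the decoupling the abstract refers to; updating `z` from the latest transitions at every step is what permits zero-shot adaptation when the task changes mid-episode.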
Pages: 13604-13618 (15 pages)
Related papers
50 records total
  • [1] Context-Based Meta-Reinforcement Learning With Bayesian Nonparametric Models
    Bing, Zhenshan
    Yun, Yuqi
    Huang, Kai
    Knoll, Alois
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (10) : 6948 - 6965
  • [2] Meta-Reinforcement Learning in Non-Stationary and Dynamic Environments
    Bing, Zhenshan
    Lerch, David
    Huang, Kai
    Knoll, Alois
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (03) : 3476 - 3491
  • [3] Hypernetworks in Meta-Reinforcement Learning
    Beck, Jacob
    Jackson, Matthew
    Vuorio, Risto
    Whiteson, Shimon
    CONFERENCE ON ROBOT LEARNING, VOL 205, 2022, 205 : 1478 - 1487
  • [4] XLand-MiniGrid: Scalable Meta-Reinforcement Learning Environments in JAX
    Nikulin, Alexander
    Agarkov, Artem
    Kurenkov, Vladislav
    Sinii, Viacheslav
    Zisman, Ilya
    Kolesnikov, Sergey
    arXiv, 2023
  • [5] Prefrontal cortex as a meta-reinforcement learning system
    Wang, Jane X.
    Kurth-Nelson, Zeb
    Kumaran, Dharshan
    Tirumala, Dhruva
    Soyer, Hubert
    Leibo, Joel Z.
    Hassabis, Demis
    Botvinick, Matthew
    NATURE NEUROSCIENCE, 2018, 21 : 860 - 868
  • [6] Offline Meta-Reinforcement Learning for Industrial Insertion
    Zhao, Tony Z.
    Luo, Jianlan
    Sushkov, Oleg
    Pevceviciute, Rugile
    Heess, Nicolas
    Scholz, Jon
    Schaal, Stefan
    Levine, Sergey
    2022 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2022, 2022, : 6386 - 6393
  • [7] A Meta-Reinforcement Learning Approach to Process Control
    McClement, Daniel G.
    Lawrence, Nathan P.
    Loewen, Philip D.
    Forbes, Michael G.
    Backstrom, Johan U.
    Gopaluni, R. Bhushan
    IFAC PAPERSONLINE, 2021, 54 (03): : 685 - 692
  • [8] Unsupervised Curricula for Visual Meta-Reinforcement Learning
    Jabri, Allan
    Hsu, Kyle
    Eysenbach, Benjamin
    Gupta, Abhishek
    Levine, Sergey
    Finn, Chelsea
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [9] Meta-Reinforcement Learning of Structured Exploration Strategies
    Gupta, Abhishek
    Mendonca, Russell
    Liu, YuXuan
    Abbeel, Pieter
    Levine, Sergey
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [10] Meta-Reinforcement Learning With Dynamic Adaptiveness Distillation
    Hu, Hangkai
    Huang, Gao
    Li, Xiang
    Song, Shiji
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (03) : 1454 - 1464