Variational Skill Embeddings for Meta Reinforcement Learning

Cited by: 2
Authors
Chien, Jen-Tzung [1]
Lai, Weiwei [1]
Affiliation
[1] Natl Yang Ming Chiao Tung Univ, Inst Elect & Comp Engn, Hsinchu, Taiwan
DOI
10.1109/IJCNN54540.2023.10191425
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Meta reinforcement learning (meta-RL) aims to learn useful prior knowledge across tasks that can be generalized to unseen but similar tasks with only a small number of adaptation steps. Traditionally, gradient-based meta-RL used gradients to learn the parameters of an adaptive policy from different tasks, an approach that likely lacked sample efficiency. More recently, context-based meta-RL improved the efficiency by learning embeddings of the trajectories based on a context representation. The learned policy can be adapted to new tasks, but its performance is bounded by the simple context encoder. To deal with this insufficiency, this paper presents a novel regularized meta-RL in which the generalization of the policy is enhanced through a context-based meta-RL implemented as a conditional variational autoencoder consisting of a context-skill encoder and a soft actor-critic decoder. The proposed method pursues model regularization by discovering the skill patterns shared across tasks in the implementation of context-based meta-RL. Experiments on a number of benchmark tasks show the merit of variational skill embeddings for regularized meta-RL.
Pages: 8