Hypernetworks for Zero-Shot Transfer in Reinforcement Learning

Cited: 0
Authors
Rezaei-Shoshtari, Sahand [1 ,2 ,3 ]
Morissette, Charlotte [1 ,3 ]
Hogan, Francois R. [3 ]
Dudek, Gregory [1 ,2 ,3 ]
Meger, David [1 ,2 ,3 ]
Affiliations
[1] McGill Univ, Montreal, PQ, Canada
[2] Mila Quebec AI Inst, Montreal, PQ, Canada
[3] Samsung AI Ctr Montreal, Montreal, PQ, Canada
DOI
Not available
Chinese Library Classification
TP18 (Artificial intelligence theory)
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, hypernetworks are trained to generate behaviors across a range of unseen task conditions, via a novel TD-based training objective and data from a set of near-optimal RL solutions for training tasks. This work relates to meta RL, contextual RL, and transfer learning, with a particular focus on zero-shot performance at test time, enabled by knowledge of the task parameters (also known as context). Our technical approach views each RL algorithm as a mapping from the MDP specifics to the near-optimal value function and policy, and seeks to approximate this mapping with a hypernetwork that can generate near-optimal value functions and policies given the parameters of the MDP. We show that, under certain conditions, this mapping can be treated as a supervised learning problem. We empirically evaluate the effectiveness of our method for zero-shot transfer to new reward and transition dynamics on a series of continuous control tasks from the DeepMind Control Suite. Our method demonstrates significant improvements over baselines from multi-task and meta RL approaches.
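To make the core idea concrete, the sketch below shows the structure the abstract describes: a hypernetwork that maps task parameters (the context) to the weights of a target policy, so a single forward pass yields a policy for an unseen task with no per-task training. This is a minimal illustrative example with untrained random weights; all names, dimensions, and the linear target-policy form are assumptions for exposition, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM, ACTION_DIM, CTX_DIM, HIDDEN = 4, 2, 3, 16
N_POLICY_PARAMS = ACTION_DIM * STATE_DIM + ACTION_DIM  # W and b of target policy

# Hypernetwork parameters: a one-hidden-layer MLP over the context vector.
H1 = rng.normal(scale=0.1, size=(HIDDEN, CTX_DIM))
H2 = rng.normal(scale=0.1, size=(N_POLICY_PARAMS, HIDDEN))

def hypernetwork(context):
    """Generate the target policy's weights (W, b) from task parameters."""
    h = np.tanh(H1 @ context)
    theta = H2 @ h
    W = theta[: ACTION_DIM * STATE_DIM].reshape(ACTION_DIM, STATE_DIM)
    b = theta[ACTION_DIM * STATE_DIM:]
    return W, b

def policy(state, context):
    """Zero-shot policy: its weights come from the hypernetwork, not per-task training."""
    W, b = hypernetwork(context)
    return np.tanh(W @ state + b)  # bounded continuous action

# Two different (unseen) task contexts produce two different behaviors
# for the same state, from one shared set of hypernetwork parameters.
s = rng.normal(size=STATE_DIM)
a1 = policy(s, np.array([1.0, 0.0, 0.0]))
a2 = policy(s, np.array([0.0, 1.0, 0.0]))
```

In the paper's setting, the hypernetwork would be trained on data from near-optimal solutions of the training tasks (via the TD-based objective), so that at test time new contexts map directly to near-optimal value functions and policies.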
Pages: 9579-9587 (9 pages)