Hypernetworks for Zero-Shot Transfer in Reinforcement Learning

Citations: 0
Authors:
Rezaei-Shoshtari, Sahand [1,2,3]
Morissette, Charlotte [1,3]
Hogan, Francois R. [3]
Dudek, Gregory [1,2,3]
Meger, David [1,2,3]
Affiliations:
[1] McGill Univ, Montreal, PQ, Canada
[2] Mila Quebec AI Inst, Montreal, PQ, Canada
[3] Samsung AI Ctr Montreal, Montreal, PQ, Canada
Keywords: (none listed)
DOI: not available
CLC classification: TP18 [Artificial Intelligence Theory]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract:
In this paper, hypernetworks are trained to generate behaviors across a range of unseen task conditions, via a novel TD-based training objective and data from a set of near-optimal RL solutions for the training tasks. This work relates to meta RL, contextual RL, and transfer learning, with a particular focus on zero-shot performance at test time, enabled by knowledge of the task parameters (also known as context). Our technical approach views each RL algorithm as a mapping from the MDP specifics to the near-optimal value function and policy, and seeks to approximate this mapping with a hypernetwork that can generate near-optimal value functions and policies given the parameters of the MDP. We show that, under certain conditions, learning this mapping can be treated as a supervised learning problem. We empirically evaluate the effectiveness of our method for zero-shot transfer to new reward and transition dynamics on a series of continuous control tasks from the DeepMind Control Suite. Our method demonstrates significant improvements over baselines from multi-task and meta RL approaches.
Pages: 9579-9587 (9 pages)
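
To make the approach described in the abstract concrete, below is a minimal sketch of a context-conditioned hypernetwork: a generator network maps the task parameters (context) to the weights of a small policy network, so a new, unseen context yields a usable policy with no further gradient steps. This is an illustrative sketch, not the authors' released implementation; all class and variable names, layer sizes, and the tanh-bounded action are assumptions, and the paper's TD-based training objective is omitted.

```python
import torch
import torch.nn as nn


class PolicyHypernetwork(nn.Module):
    """Maps task parameters (context) to the weights of a small policy MLP."""

    def __init__(self, context_dim, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.obs_dim, self.act_dim, self.hidden = obs_dim, act_dim, hidden
        # Total parameter count of the generated policy:
        # one hidden layer (W1, b1) plus an output layer (W2, b2).
        n_params = (obs_dim * hidden + hidden) + (hidden * act_dim + act_dim)
        self.generator = nn.Sequential(
            nn.Linear(context_dim, 256),
            nn.ReLU(),
            nn.Linear(256, n_params),
        )

    def forward(self, context, obs):
        # Generate the policy's flat weight vector from the task parameters,
        # then reshape it into the individual layers.
        flat = self.generator(context)
        i = 0
        w1 = flat[i:i + self.obs_dim * self.hidden].view(self.hidden, self.obs_dim)
        i += self.obs_dim * self.hidden
        b1 = flat[i:i + self.hidden]
        i += self.hidden
        w2 = flat[i:i + self.hidden * self.act_dim].view(self.act_dim, self.hidden)
        i += self.hidden * self.act_dim
        b2 = flat[i:]
        # Evaluate the generated policy on the observation.
        h = torch.relu(obs @ w1.T + b1)
        return torch.tanh(h @ w2.T + b2)  # bounded action, as in DM Control


# Zero-shot usage: an unseen context produces an action with no fine-tuning.
hyper = PolicyHypernetwork(context_dim=3, obs_dim=8, act_dim=2)
context = torch.tensor([0.5, 1.2, -0.3])  # e.g. mass / damping / goal params
obs = torch.randn(8)
action = hyper(context, obs)
print(action.shape)  # torch.Size([2])
```

In the paper's setting, such a generator would be fit to near-optimal value functions and policies gathered from the training tasks (via the TD-based objective the abstract mentions); at test time only the forward pass above is needed, which is what makes the transfer zero-shot.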