MAML2: meta reinforcement learning via meta-learning for task categories

Cited by: 5
Authors
Fu, Qiming [1]
Wang, Zhechao [1]
Fang, Nengwei [2]
Xing, Bin [2]
Zhang, Xiao [3]
Chen, Jianping [4]
Affiliations
[1] Suzhou Univ Sci & Technol, Sch Elect & Informat Engn, Suzhou 215009, Peoples R China
[2] Chongqing Ind Big Data Innovat Ctr Co Ltd, Chongqing 404100, Peoples R China
[3] Xuzhou Med Univ, Sch Med Informat, Xuzhou 221004, Peoples R China
[4] Suzhou Univ Sci & Technol, Sch Architecture & Urban Planning, Suzhou 215009, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
meta-learning; reinforcement learning; few-shot learning; negative adaptation;
DOI
10.1007/s11704-022-2037-1
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Meta-learning has been widely applied to few-shot reinforcement learning problems, where the goal is an agent that can learn quickly in a new task. However, these algorithms often neglect isolated tasks in pursuit of average performance, which may result in negative adaptation on those tasks, and they usually require sufficient learning under a stationary task distribution. This paper presents a hierarchical double meta-learning framework consisting of three processes: classification, meta-learning, and re-adaptation. First, in the classification process, tasks are grouped into several subsets, each treated as a task category, according to the parameters learned for each task; this separates out the isolated tasks. Second, in the meta-learning process, a category parameter is learned for each subset via meta-learning. Simultaneously, based on the gradient of each category parameter in its subset, meta-learning is applied again to learn a meta-parameter for the whole task set, which serves as the initial parameter for a new task. Finally, in the re-adaptation process, the parameter of the new task is adapted in two steps, using first the meta-parameter and then the appropriate category parameter. Experiments demonstrate that the algorithm prevents negative adaptation without sacrificing average performance over the whole task set. Re-adaptation also yields a more rapid adaptation process. Moreover, the algorithm performs well with fewer samples when the agent is placed in an online meta-learning setting.
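The three processes named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no update rules, so the sketch assumes 1-D quadratic task losses, a gap-based grouping rule in place of the paper's classifier, and a first-order (Reptile-style) meta-update in place of the paper's gradient-based rule. All function names, parameters, and task values are hypothetical.

```python
# Toy sketch: each task is a 1-D quadratic loss 0.5 * (theta - target) ** 2.
# All names, values, and update rules are illustrative assumptions,
# not the method described in the paper.

def adapt(theta, target, lr=0.5, steps=1):
    """Inner loop: plain gradient steps on the toy task loss."""
    for _ in range(steps):
        theta -= lr * (theta - target)  # the loss gradient is (theta - target)
    return theta

def classify(learned_params, gap=1.0):
    """Split tasks into categories wherever sorted learned parameters jump by more than `gap`."""
    order = sorted(range(len(learned_params)), key=lambda i: learned_params[i])
    groups = [[order[0]]]
    for prev, cur in zip(order, order[1:]):
        if learned_params[cur] - learned_params[prev] > gap:
            groups.append([cur])  # large jump: start a new category
        else:
            groups[-1].append(cur)
    return groups

def meta_learn(init, targets, meta_lr=0.5, iters=50):
    """First-order (Reptile-style) meta-update, assumed here for simplicity."""
    phi = init
    for _ in range(iters):
        mean_shift = sum(adapt(phi, t) - phi for t in targets) / len(targets)
        phi += meta_lr * mean_shift
    return phi

def re_adapt(meta_param, category_params, new_target):
    """Two-step adaptation: probe from the meta-parameter, then adapt from the nearest category."""
    probe = adapt(meta_param, new_target)
    nearest = min(category_params, key=lambda c: abs(c - probe))
    return adapt(nearest, new_target)

# Four similar tasks and one isolated task (target 5.0).
task_targets = [0.0, 0.2, -0.1, 0.1, 5.0]

# Classification: group tasks by the parameters each one learns on its own.
learned = [adapt(0.0, t, steps=5) for t in task_targets]
groups = classify(learned)

# Meta-learning: one category parameter per subset, then a meta-parameter
# learned over the category parameters (a simplification of the second level).
category_params = [meta_learn(0.0, [task_targets[i] for i in g]) for g in groups]
meta_param = meta_learn(0.0, category_params)

# Re-adaptation on a new task that resembles the isolated one, compared with
# a single flat meta-parameter learned over all tasks at once.
new_target = 4.8
hierarchical = re_adapt(meta_param, category_params, new_target)
flat = adapt(meta_learn(0.0, task_targets), new_target)
```

In this toy setting the flat meta-parameter is pulled toward the four similar tasks, so one adaptation step leaves it far from the new task, while the hierarchical route starts from the isolated task's own category parameter. The comparison only illustrates the idea of negative adaptation on isolated tasks; it says nothing about the paper's reported results.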
Pages: 11