MAML2: meta reinforcement learning via meta-learning for task categories

Cited by: 5
Authors
Fu, Qiming [1]
Wang, Zhechao [1]
Fang, Nengwei [2]
Xing, Bin [2]
Zhang, Xiao [3]
Chen, Jianping [4]
Affiliations
[1] Suzhou Univ Sci & Technol, Sch Elect & Informat Engn, Suzhou 215009, Peoples R China
[2] Chongqing Ind Big Data Innovat Ctr Co Ltd, Chongqing 404100, Peoples R China
[3] Xuzhou Med Univ, Sch Med Informat, Xuzhou 221004, Peoples R China
[4] Suzhou Univ Sci & Technol, Sch Architecture & Urban Planning, Suzhou 215009, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
meta-learning; reinforcement learning; few-shot learning; negative adaptation
DOI
10.1007/s11704-022-2037-1
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Meta-learning has been widely applied to few-shot reinforcement learning problems, where the goal is an agent that can learn quickly in a new task. However, these algorithms often ignore some isolated tasks in pursuit of average performance, which may result in negative adaptation on those tasks, and they usually need sufficient learning in a stationary task distribution. In this paper, we present a hierarchical framework of double meta-learning comprising three processes: classification, meta-learning, and re-adaptation. First, in the classification process, we group tasks into several subsets, treated as task categories, according to the learned parameters of each task, which separates out isolated tasks. Second, in the meta-learning process, we learn a category parameter for each subset via meta-learning. Simultaneously, based on the gradient of each category parameter in each subset, we apply meta-learning again to learn a new meta-parameter for the whole task set, which can be used as the initial parameter for a new task. Finally, in the re-adaptation process, we adapt the parameter of the new task in two steps, using the meta-parameter and then the appropriate category parameter. Experimentally, we demonstrate that our algorithm prevents the agent from negative adaptation without sacrificing average performance over the whole task set. Additionally, our algorithm adapts more rapidly during re-adaptation. Moreover, we show that our algorithm performs well with fewer samples when the agent is exposed to an online meta-learning setting.
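The three processes named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes scalar parameters, supervised quadratic task losses 0.5 * (theta - c)^2 in place of reinforcement learning objectives, a simple gap-based grouping rule, and first-order MAML updates. All function names, step sizes, and the rule for choosing the "appropriate" category are hypothetical choices made for illustration.

```python
# Toy illustration only (assumptions: scalar parameters, quadratic task
# losses, gap-based grouping, first-order MAML). Names are hypothetical.

def grad(theta, c):
    # Gradient of the toy task loss 0.5 * (theta - c) ** 2.
    return theta - c

def adapt(theta, c, lr=0.5, steps=1):
    # Plain gradient-descent adaptation to a single task.
    for _ in range(steps):
        theta = theta - lr * grad(theta, c)
    return theta

def classify(centers, theta0=0.0, gap=1.0):
    # Classification: adapt to every task from a shared start, then split
    # the sorted learned parameters wherever neighbours differ by > gap.
    learned = sorted((adapt(theta0, c, steps=3), c) for c in centers)
    groups = [[learned[0][1]]]
    for (prev, _), (cur, c) in zip(learned, learned[1:]):
        if cur - prev > gap:
            groups.append([])
        groups[-1].append(c)
    return groups

def category_gradient(theta, tasks, inner_lr=0.5):
    # First-order meta-gradient: task gradients evaluated after one
    # inner adaptation step, averaged over the tasks of one category.
    return sum(grad(adapt(theta, c, inner_lr), c) for c in tasks) / len(tasks)

def learn_category_parameter(tasks, outer_lr=0.1, iters=200):
    # Meta-learning, level 1: one parameter per task category.
    phi = 0.0
    for _ in range(iters):
        phi -= outer_lr * category_gradient(phi, tasks)
    return phi

def learn_meta_parameter(groups, outer_lr=0.1, iters=200):
    # Meta-learning, level 2: average the per-category gradients so each
    # category (even a single isolated task) carries equal weight.
    theta = 0.0
    for _ in range(iters):
        grads = [category_gradient(theta, g) for g in groups]
        theta -= outer_lr * sum(grads) / len(grads)
    return theta

def readapt(c_new, theta, phis, lr=0.5):
    # Re-adaptation: step once from the meta-parameter, pick the closest
    # category parameter, then step once from that category parameter.
    first = adapt(theta, c_new, lr)
    phi = min(phis, key=lambda p: abs(p - first))
    return first, adapt(phi, c_new, lr)

task_centers = [1.0, 0.9, 1.1, 10.0]   # the task at 10.0 is isolated
groups = classify(task_centers)
phis = [learn_category_parameter(g) for g in groups]
theta = learn_meta_parameter(groups)
first, result = readapt(9.5, theta, phis)
print(groups, phis, theta, first, result)
```

In this toy setting the isolated task ends up in its own category, so a new task near it is adapted from that category's parameter rather than from a parameter dominated by the majority of tasks, which is the intuition behind avoiding negative adaptation.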
Pages: 11