MAML2: meta reinforcement learning via meta-learning for task categories

Cited by: 5
Authors
Fu, Qiming [1 ]
Wang, Zhechao [1 ]
Fang, Nengwei [2 ]
Xing, Bin [2 ]
Zhang, Xiao [3 ]
Chen, Jianping [4 ]
Affiliations
[1] Suzhou Univ Sci & Technol, Sch Elect & Informat Engn, Suzhou 215009, Peoples R China
[2] Chongqing Ind Big Data Innovat Ctr Co Ltd, Chongqing 404100, Peoples R China
[3] Xuzhou Med Univ, Sch Med Informat, Xuzhou 221004, Peoples R China
[4] Suzhou Univ Sci & Technol, Sch Architecture & Urban Planning, Suzhou 215009, Peoples R China
Funding
National Natural Science Foundation of China; National Key R&D Program of China;
Keywords
meta-learning; reinforcement learning; few-shot learning; negative adaptation;
DOI
10.1007/s11704-022-2037-1
Chinese Library Classification (CLC)
TP [Automation technology; computer technology];
Discipline code
0812 ;
Abstract
Meta-learning has been widely applied to few-shot reinforcement learning, where the goal is an agent that can learn quickly in a new task. However, existing algorithms often sacrifice isolated tasks in pursuit of average performance, which can cause negative adaptation on those tasks, and they usually require extensive training on a stationary task distribution. In this paper, we present a hierarchical framework of double meta-learning comprising three stages: classification, meta-learning, and re-adaptation. First, in the classification stage, we partition tasks into several subsets, treated as task categories, according to the parameters learned on each task; this separates out isolated tasks. Second, in the meta-learning stage, we learn a category parameter for each subset via meta-learning; simultaneously, based on the gradient of each category parameter, we apply meta-learning again to learn a meta-parameter for the whole task set, which serves as the initial parameter for a new task. Finally, in the re-adaptation stage, we adapt the parameter of the new task in two steps, first from the meta-parameter and then from the appropriate category parameter. Experiments show that our algorithm prevents the agent from negative adaptation without losing average performance on the whole task set, adapts more rapidly during re-adaptation, and performs well with fewer samples when the agent is exposed to an online meta-learning setting.
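The three-stage pipeline described in the abstract (classify tasks into categories, meta-learn category parameters plus a global meta-parameter, then re-adapt in two steps) can be sketched on a toy problem. This is a minimal first-order illustration, not the paper's implementation: the 1-D quadratic tasks, the sign-based clustering standing in for classification, and the Reptile-style meta-update are all assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: each task is a 1-D quadratic loss (theta - c)^2,
# so adapting to a task means moving theta toward the task optimum c.
# Two clusters of optima play the role of two task categories.
task_optima = np.concatenate([rng.normal(-3.0, 0.3, 8),
                              rng.normal(+3.0, 0.3, 8)])

def grad(theta, c):
    return 2.0 * (theta - c)

def adapt(theta, c, lr=0.2, steps=3):
    """Task-specific adaptation: a few inner gradient steps."""
    for _ in range(steps):
        theta = theta - lr * grad(theta, c)
    return theta

# Stage 1 (classification): adapt a shared init on each task, then group
# tasks by their adapted parameters (a simple sign split stands in for a
# real clustering of learned parameters).
theta0 = 0.0
adapted = np.array([adapt(theta0, c) for c in task_optima])
categories = [task_optima[adapted < 0], task_optima[adapted >= 0]]

# Stage 2 (meta-learning): a first-order (Reptile-style) meta-update per
# category, then a meta-parameter aggregated over the category parameters.
def meta_learn(theta, tasks, meta_lr=0.5, iters=20):
    for _ in range(iters):
        c = rng.choice(tasks)
        theta = theta + meta_lr * (adapt(theta, c) - theta)
    return theta

category_params = [meta_learn(theta0, cat) for cat in categories]
meta_param = float(np.mean(category_params))

# Stage 3 (re-adaptation): for a new task, take one step from the
# meta-parameter, then continue from the nearest category parameter.
new_c = -2.8
step1 = adapt(meta_param, new_c, steps=1)
nearest = min(category_params, key=lambda p: abs(p - step1))
final = adapt(nearest, new_c, steps=3)

print(f"meta-parameter: {meta_param:.2f}, re-adapted parameter: {final:.2f}")
```

Because the new task's optimum lies near one category, the second re-adaptation step starts from that category's parameter and converges in a few gradient steps, which is the intuition behind avoiding negative adaptation on tasks far from the population average.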
Pages: 11
Related papers
50 in total
  • [41] Meta-Learning via Weighted Gradient Update
    Xu, Zhixiong
    Cao, Lei
    Chen, Xiliang
    IEEE ACCESS, 2019, 7 : 110846 - 110855
  • [42] Personalizing Dialogue Agents via Meta-Learning
    Madotto, Andrea
    Lin, Zhaojiang
    Wu, Chien-Sheng
    Fung, Pascale
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 5454 - 5459
  • [43] Edge sparsification for graphs via meta-learning
    Wan, Guihong
    Schweitzer, Haim
PROCEEDINGS - INTERNATIONAL CONFERENCE ON DATA ENGINEERING, 2021, 2021-April : 2733 - 2738
  • [44] Exploration With Task Information for Meta Reinforcement Learning
    Jiang, Peng
    Song, Shiji
    Huang, Gao
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (08) : 4033 - 4046
  • [45] Automated imbalanced classification via meta-learning
    Moniz, Nuno
    Cerqueira, Vitor
EXPERT SYSTEMS WITH APPLICATIONS, 2021, 178
  • [46] Fast Context Adaptation via Meta-Learning
    Zintgraf, Luisa
    Shiarlis, Kyriacos
    Kurin, Vitaly
    Hofmann, Katja
    Whiteson, Shimon
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [48] Online Meta-Learning
    Finn, Chelsea
    Rajeswaran, Aravind
    Kakade, Sham
    Levine, Sergey
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [49] Submodular Meta-Learning
    Adibi, Arman
    Mokhtari, Aryan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [50] Meta-learning with backpropagation
    Younger, AS
    Hochreiter, S
    Conwell, PR
    IJCNN'01: INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, PROCEEDINGS, 2001, : 2001 - 2006