MAML2: meta reinforcement learning via meta-learning for task categories

Cited by: 5
Authors
Fu, Qiming [1]
Wang, Zhechao [1]
Fang, Nengwei [2]
Xing, Bin [2]
Zhang, Xiao [3]
Chen, Jianping [4]
Affiliations
[1] Suzhou Univ Sci & Technol, Sch Elect & Informat Engn, Suzhou 215009, Peoples R China
[2] Chongqing Ind Big Data Innovat Ctr Co Ltd, Chongqing 404100, Peoples R China
[3] Xuzhou Med Univ, Sch Med Informat, Xuzhou 221004, Peoples R China
[4] Suzhou Univ Sci & Technol, Sch Architecture & Urban Planning, Suzhou 215009, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
meta-learning; reinforcement learning; few-shot learning; negative adaptation;
DOI
10.1007/s11704-022-2037-1
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Meta-learning has been widely applied to few-shot reinforcement learning problems, where the goal is an agent that learns quickly on a new task. However, in pursuit of average performance, existing algorithms often ignore isolated tasks, which may result in negative adaptation on those tasks, and they usually require sufficient learning under a stationary task distribution. In this paper, we present a hierarchical double meta-learning framework consisting of three processes: classification, meta-learning, and re-adaptation. First, in the classification process, we use the learned parameters of each task to partition the tasks into several subsets, each treated as a task category, which separates out isolated tasks. Second, in the meta-learning process, we learn a category parameter for every subset via meta-learning. Simultaneously, based on the gradient of each category parameter in each subset, we apply meta-learning again to learn a new meta-parameter for the whole task set, which serves as the initial parameter for a new task. Finally, in the re-adaptation process, we adapt the parameter of the new task in two steps, using the meta-parameter and then the appropriate category parameter. Experimentally, we demonstrate that our algorithm prevents the agent from negative adaptation without sacrificing average performance over the whole task set. Additionally, our algorithm adapts more rapidly during re-adaptation. Moreover, we show that our algorithm performs well with fewer samples when the agent is exposed to an online meta-learning setting.
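The three processes named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no update rules, so everything below is an assumption made for illustration, including the scalar quadratic tasks, the gap-based grouping in `classify`, the Reptile-style first-order updates in `meta_learn`, and the nearest-category rule in `readapt`. All function names and step sizes are hypothetical.

```python
# Toy sketch only (assumed setup, not the paper's algorithm):
# each task is a scalar target t with loss 0.5 * (theta - t) ** 2.

def adapt(theta, target, lr=0.5, steps=3):
    """Plain gradient descent on one task's quadratic loss."""
    for _ in range(steps):
        theta -= lr * (theta - target)  # gradient of 0.5 * (theta - target)**2
    return theta

def classify(task_params, gap=2.0):
    """Assumed rule: sort the learned parameters, start a new subset at large gaps."""
    order = sorted(range(len(task_params)), key=lambda i: task_params[i])
    subsets = [[order[0]]]
    for i in order[1:]:
        if task_params[i] - task_params[subsets[-1][-1]] > gap:
            subsets.append([i])   # far from the previous task: new category
        else:
            subsets[-1].append(i)
    return subsets

def meta_learn(targets, subsets, meta_lr=0.1, iters=200):
    """Learn one category parameter per subset plus a global meta-parameter."""
    meta, cats = 0.0, [0.0] * len(subsets)
    for _ in range(iters):
        directions = []
        for k, subset in enumerate(subsets):
            # First level: Reptile-style step of the category parameter.
            adapted = [adapt(cats[k], targets[i]) for i in subset]
            cats[k] += meta_lr * (sum(adapted) / len(adapted) - cats[k])
            # Update direction of this category, evaluated at the meta-parameter.
            from_meta = [adapt(meta, targets[i]) for i in subset]
            directions.append(sum(from_meta) / len(from_meta) - meta)
        # Second level: average over categories, not over tasks, so a
        # single isolated task still carries a full category's weight.
        meta += meta_lr * sum(directions) / len(directions)
    return meta, cats

def readapt(new_target, meta, cats):
    """Two-step adaptation: from the meta-parameter, then from the nearest category."""
    first = adapt(meta, new_target, steps=1)
    k = min(range(len(cats)), key=lambda j: abs(cats[j] - first))
    return k, adapt(cats[k], new_target, steps=1)

targets = [0.0, 0.5, 1.0, 10.0]                  # the last task is isolated
task_params = [adapt(0.0, t) for t in targets]   # learned parameters per task
subsets = classify(task_params)                  # [[0, 1, 2], [3]]
meta, cats = meta_learn(targets, subsets)
category, theta = readapt(9.0, meta, cats)
```

In this toy setting the isolated task ends up in its own subset, and a new task with target 9.0 is routed to that subset's category parameter in the second re-adaptation step, which lands closer to its target than a single step from the meta-parameter alone.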
Pages: 11