MAML2: meta reinforcement learning via meta-learning for task categories

Cited by: 5

Authors:

Fu, Qiming [1]

Wang, Zhechao [1]

Fang, Nengwei [2]

Xing, Bin [2]

Zhang, Xiao [3]

Chen, Jianping [4]

Affiliations:

[1] Suzhou Univ Sci & Technol, Sch Elect & Informat Engn, Suzhou 215009, Peoples R China

[2] Chongqing Ind Big Data Innovat Ctr Co Ltd, Chongqing 404100, Peoples R China

[3] Xuzhou Med Univ, Sch Med Informat, Xuzhou 221004, Peoples R China

[4] Suzhou Univ Sci & Technol, Sch Architecture & Urban Planning, Suzhou 215009, Peoples R China

Source:

FRONTIERS OF COMPUTER SCIENCE | 2023, Vol. 17, No. 04

Funding:

National Natural Science Foundation of China; National Key Research and Development Program;

Keywords:

meta-learning; reinforcement learning; few-shot learning; negative adaptation;

DOI:

10.1007/s11704-022-2037-1

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Meta-learning has been widely applied to solving few-shot reinforcement learning problems, where we hope to obtain an agent that can learn quickly in a new task. However, these algorithms often ignore some isolated tasks in pursuit of the average performance, which may result in negative adaptation in these isolated tasks, and they usually need sufficient learning in a stationary task distribution. In this paper, our algorithm presents a hierarchical framework of double meta-learning, and the whole framework includes classification, meta-learning, and re-adaptation. Firstly, in the classification process, we classify tasks into several task subsets, considered as some categories of tasks, by learned parameters of each task, which can separate out some isolated tasks thereafter. Secondly, in the meta-learning process, we learn category parameters in all subsets via meta-learning. Simultaneously, based on the gradient of each category parameter in each subset, we use meta-learning again to learn a new meta-parameter related to the whole task set, which can be used as an initial parameter for the new task. Finally, in the re-adaptation process, we adapt the parameter of the new task with two steps, by the meta-parameter and the appropriate category parameter successively. Experimentally, we demonstrate our algorithm prevents the agent from negative adaptation without losing the average performance for the whole task set. Additionally, our algorithm presents a more rapid adaptation process within re-adaptation. Moreover, we show the good performance of our algorithm with fewer samples as the agent is exposed to an online meta-learning setting.
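
The three stages named in the abstract (classification, meta-learning, re-adaptation) can be illustrated with a minimal toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: each task is a quadratic loss with a hypothetical optimum `target`, classification is plain k-means over per-task learned parameters, both meta-learning levels use a Reptile-style first-order update as a stand-in for the gradient-based updates the abstract mentions, and re-adaptation is read as "one step from the meta-parameter to pick a category, then one step from that category parameter". The names `adapt`, `classify`, `meta_learn`, and `re_adapt` are hypothetical.

```python
import numpy as np

def adapt(theta, target, lr=0.5, steps=1):
    """Gradient steps on the toy task loss 0.5 * ||theta - target||^2."""
    for _ in range(steps):
        theta = theta - lr * (theta - target)
    return theta

def classify(task_params, k, iters=20):
    """Assumed classifier: k-means over per-task learned parameters."""
    # Deterministic farthest-point initialisation of the k centres.
    centers = [task_params[0]]
    for _ in range(k - 1):
        dists = np.min(
            [np.linalg.norm(task_params - c, axis=1) for c in centers], axis=0
        )
        centers.append(task_params[np.argmax(dists)])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        gaps = np.linalg.norm(task_params[:, None] - centers[None], axis=2)
        labels = np.argmin(gaps, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = task_params[labels == j].mean(axis=0)
    return labels

def meta_learn(targets, labels, k, outer_lr=0.1, epochs=200):
    """Two-level, Reptile-style meta-learning (a first-order stand-in)."""
    category = np.zeros((k, targets.shape[1]))
    meta = np.zeros(targets.shape[1])
    for _ in range(epochs):
        for j in range(k):
            members = targets[labels == j]
            if len(members) == 0:
                continue
            # Level 1: move the category parameter toward its adapted copies.
            adapted = np.array([adapt(category[j], t) for t in members])
            category[j] += outer_lr * (adapted.mean(axis=0) - category[j])
        # Level 2: move the meta-parameter toward the category parameters.
        meta += outer_lr * (category.mean(axis=0) - meta)
    return meta, category

def re_adapt(meta, category, new_target):
    """Step 1 from the meta-parameter, step 2 from the nearest category."""
    rough = adapt(meta, new_target)
    j = int(np.argmin(np.linalg.norm(category - rough, axis=1)))
    return adapt(category[j], new_target), j

# Toy task set: two groups of similar tasks plus one isolated task (last row).
targets = np.array([[0.0, 0.0], [0.2, -0.1], [-0.1, 0.1],
                    [5.0, 5.0], [5.1, 4.9], [4.9, 5.2],
                    [-8.0, 8.0]])
learned = np.array([adapt(np.zeros(2), t, steps=5) for t in targets])
labels = classify(learned, k=3)
meta, category = meta_learn(targets, labels, k=3)

# A new task close to the isolated one.
new_target = np.array([-7.5, 8.2])
theta_new, chosen = re_adapt(meta, category, new_target)
meta_only = adapt(meta, new_target)
```

In this toy setting the isolated task ends up in its own category, so a new task near it starts its second adaptation step from a category parameter that is already close, rather than from a meta-parameter averaged over the whole task set.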

Pages: 11

50 items in total

[31] The reinforcement metalearner as a biologically plausible meta-learning framework
Vriens, Tim
Horan, Mattias
Gottlieb, Jacqueline
Silvetti, Massimo
BEHAVIORAL AND BRAIN SCIENCES, 2024, 47
[32] Towards well-generalizing meta-learning via adversarial task augmentation
Wang, Haoqing
Mai, Huiyu
Gong, Yuhang
Deng, Zhi-Hong
ARTIFICIAL INTELLIGENCE, 2023, 317
[33] Task Aligned Generative Meta-learning for Zero-shot Learning
Liu, Zhe
Li, Yun
Yao, Lina
Wang, Xianzhi
Long, Guodong
THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 8723 - 8731
[34] Learning to Transfer: Unsupervised Domain Translation via Meta-Learning
Lin, Jianxin
Wang, Yijun
Chen, Zhibo
He, Tianyu
THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 11507 - 11514
[35] Improving progressive sampling via meta-learning on learning curves
Leite, R
Brazdil, P
MACHINE LEARNING: ECML 2004, PROCEEDINGS, 2004, 3201 : 250 - 261
[36] Automatic Modulation Classification via Meta-Learning
Hao, Xiaoyang
Feng, Zhixi
Yang, Shuyuan
Wang, Min
Jiao, Licheng
IEEE INTERNET OF THINGS JOURNAL, 2023, 10 (14) : 12276 - 12292
[37] Dynamic Graph Embedding via Meta-Learning
Mao, Yuren
Hao, Yu
Cao, Xin
Fang, Yixiang
Lin, Xuemin
Mao, Hua
Xu, Zhiqiang
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (07) : 2967 - 2979
[38] Edge Sparsification for Graphs via Meta-Learning
Wan, Guihong
2021 IEEE 37TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2021), 2021, : 2733 - 2738
[39] Incremental Object Detection via Meta-Learning
Joseph, K. J.
Rajasegaran, Jathushan
Khan, Salman
Khan, Fahad Shahbaz
Balasubramanian, Vineeth N.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (12) : 9209 - 9216
[40] Adversarial Task Up-sampling for Meta-learning
Wu, Yichen
Huang, Long-Kai
Wei, Ying
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,