Recent Advances of Foundation Language Models-based Continual Learning: A Survey

Cited by: 0
Authors
Yang, Yutao [1]
Zhou, Jie [1]
Ding, Xuanwen [1]
Huai, Tianyu [1]
Liu, Shunyu [1]
Chen, Qin [1]
Xie, Yuan [1]
He, Liang [1]
Affiliations
[1] East China Normal Univ, Sch Comp Sci & Technol, Shanghai, Peoples R China
Keywords
Continual learning; foundation language models; pre-trained language models; large language models; vision-language models; survey; NEURAL-NETWORKS; LIFELONG
DOI
10.1145/3705725
CLC Number
TP301 [Theory, Methods]
Discipline Code
081202
Abstract
Recently, foundation language models (LMs) have achieved remarkable success in natural language processing and computer vision. Unlike traditional neural networks, foundation LMs acquire rich commonsense knowledge through pre-training on massive unlabeled corpora with vast numbers of parameters, which gives them strong transfer-learning ability. Despite these capabilities, LMs still suffer from catastrophic forgetting, which hinders their ability to learn continuously as humans do. To address this, continual learning (CL) methodologies have been introduced, allowing LMs to adapt to new tasks while retaining previously learned knowledge. However, a systematic taxonomy of existing approaches and a comparison of their performance are still lacking. In this article, we present a comprehensive review, summarization, and classification of the literature on CL-based approaches for foundation language models, including pre-trained language models, large language models, and vision-language models. We divide these studies into offline and online CL, which comprise traditional methods, parameter-efficient methods, instruction tuning-based methods, and continual pre-training methods. Additionally, we outline the typical datasets and metrics employed in CL research and provide a detailed analysis of the challenges and future directions for LM-based continual learning.
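Among the traditional CL methods in the taxonomy above, rehearsal (experience replay) is one of the most common ways to mitigate catastrophic forgetting. As an illustration only, the following is a minimal PyTorch sketch of experience replay; the toy classifier, synthetic tasks, buffer size, and hyperparameters are illustrative assumptions and do not come from the paper.

# Minimal sketch of rehearsal-based (experience replay) continual learning.
# All model sizes, task data, and hyperparameters are illustrative assumptions.
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
random.seed(0)

# Toy stand-in for a language model: a small classifier over random features.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 4))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# A fixed-capacity replay buffer holding examples from earlier tasks.
buffer, buffer_cap = [], 200

def make_task(label_offset):
    """Synthetic task: two classes whose ids depend on the task (assumption)."""
    x = torch.randn(256, 32)
    y = torch.randint(0, 2, (256,)) + label_offset
    return list(zip(x, y))

for task_id, task in enumerate([make_task(0), make_task(2)]):
    for epoch in range(5):
        random.shuffle(task)
        for i in range(0, len(task), 32):
            batch = task[i:i + 32]
            # Mix in replayed examples from earlier tasks to reduce forgetting.
            if buffer:
                batch = batch + random.sample(buffer, min(32, len(buffer)))
            x = torch.stack([b[0] for b in batch])
            y = torch.stack([b[1] for b in batch])
            loss = F.cross_entropy(model(x), y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    # Reservoir-style update: keep a random subset of this task for rehearsal.
    for ex in random.sample(task, 100):
        if len(buffer) < buffer_cap:
            buffer.append(ex)
        else:
            buffer[random.randrange(buffer_cap)] = ex
    print(f"finished task {task_id}, buffer size {len(buffer)}")

The fixed-capacity buffer with random replacement approximates reservoir sampling, so the model keeps rehearsing a roughly uniform subset of earlier tasks while fitting the current one.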
Pages: 38