Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need

Times Cited: 5
Authors
Zhou, Da-Wei [1 ,2 ]
Cai, Zi-Wen [1 ,2 ]
Ye, Han-Jia [1 ,2 ]
Zhan, De-Chuan [1 ,2 ]
Liu, Ziwei [3 ]
Affiliations
[1] Nanjing Univ, Sch Artificial Intelligence, Nanjing 210023, Peoples R China
[2] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
[3] Nanyang Technol Univ, S Lab, Singapore City 639798, Singapore
Keywords
Class-incremental learning; Pre-trained models; Continual learning; Catastrophic forgetting; Representation
DOI
10.1007/s11263-024-02218-0
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting old ones. Traditional CIL models are trained from scratch to continually acquire knowledge as data evolves. Recently, pre-training has achieved substantial progress, making a vast number of pre-trained models (PTMs) accessible for CIL. In contrast to traditional methods, PTMs possess generalizable embeddings, which can be easily transferred for CIL. In this work, we revisit CIL with PTMs and argue that the core factors in CIL are adaptivity for model updating and generalizability for knowledge transfer. (1) We first reveal that a frozen PTM can already provide generalizable embeddings for CIL. Surprisingly, a simple baseline (SimpleCIL) that continually sets the PTM's classifiers to class-prototype features can beat the state of the art even without training on the downstream task. (2) Due to the distribution gap between pre-trained and downstream datasets, the PTM can be further cultivated with adaptivity via model adaptation. We propose AdaPt and mERge (Aper), which aggregates the embeddings of the PTM and the adapted model for classifier construction. Aper is a general framework that can be orthogonally combined with any parameter-efficient tuning method, and it retains both the PTM's generalizability and the adapted model's adaptivity. (3) Additionally, since previous ImageNet-based benchmarks are unsuitable in the era of PTMs due to data overlap, we propose four new benchmarks for assessment, namely ImageNet-A, ObjectNet, OmniBenchmark, and VTAB. Extensive experiments validate the effectiveness of Aper with a unified and concise framework. Code is available at https://github.com/zhoudw-zdw/RevisitingCIL.
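The abstract sketches two mechanisms: SimpleCIL builds its classifier directly from class prototypes extracted by a frozen PTM, while Aper concatenates the embeddings of the frozen PTM and a briefly adapted copy before constructing the same kind of prototype classifier. The authors' implementation lives in the linked repository; the PyTorch snippet below is only a minimal illustrative sketch of the prototype-classifier idea, where `backbone`, `class_prototypes`, `predict`, `frozen_model`, `adapted_model`, and `aggregated_backbone` are hypothetical names introduced here, and the cosine-similarity scoring is an assumption rather than a verbatim reproduction of the paper's code.

```python
# Minimal sketch (not the official code) of a prototype-based classifier in the
# spirit of SimpleCIL, plus an Aper-style embedding aggregation.
import torch
import torch.nn.functional as F


@torch.no_grad()
def class_prototypes(backbone, images, labels):
    """Average the frozen backbone's embeddings per class to obtain prototypes."""
    feats = F.normalize(backbone(images), dim=1)             # (N, D) unit-norm features
    return {c: feats[labels == c].mean(dim=0)                # mean embedding per class
            for c in labels.unique().tolist()}


@torch.no_grad()
def predict(backbone, images, protos):
    """Classify queries by cosine similarity to the stored class prototypes."""
    feats = F.normalize(backbone(images), dim=1)             # (N, D)
    classes = sorted(protos)
    weight = F.normalize(torch.stack([protos[c] for c in classes]), dim=1)  # (C, D)
    scores = feats @ weight.t()                              # (N, C) cosine scores
    return torch.tensor(classes)[scores.argmax(dim=1)]       # predicted class ids


def aggregated_backbone(frozen_model, adapted_model):
    """Aper-style sketch: concatenate frozen and adapted embeddings so the
    prototype classifier sees both generalizable and task-adapted features."""
    return lambda x: torch.cat([frozen_model(x), adapted_model(x)], dim=1)
```

Under this sketch, each new incremental task only adds its class prototypes to the stored dictionary; no gradients flow through the frozen backbone and no old-task data needs to be revisited, which is what lets such a baseline sidestep catastrophic forgetting of the classifier weights.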
Pages: 1012-1032
Number of pages: 21
Related Papers
50 records in total (the first 10 are listed below)
  • [1] Class-Incremental Learning with Strong Pre-trained Models
    Wu, Tz-Ying
    Swaminathan, Gurumurthy
    Li, Zhizhong
    Ravichandran, Avinash
    Vasconcelos, Nuno
    Bhotika, Rahul
    Soatto, Stefano
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 9591 - 9600
  • [2] Class-Incremental Learning Based on Big Dataset Pre-Trained Models
    Wen, Bin
    Zhu, Qiuyu
    IEEE ACCESS, 2023, 11 : 62028 - 62038
  • [3] Expandable Subspace Ensemble for Pre-Trained Model-Based Class-Incremental Learning
    Zhou, Da-Wei
    Sun, Hai-Long
    Ye, Han-Jia
    Zhan, De-Chuan
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 23554 - 23564
  • [4] Adapt and Refine: A Few-Shot Class-Incremental Learner via Pre-Trained Models
    Qiang, Sunyuan
    Xiong, Zhu
    Liang, Yanyan
    Wan, Jun
    Zhang, Du
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT 1, 2025, 15031 : 431 - 444
  • [5] Upstream Mitigation Is Not All You Need: Testing the Bias Transfer Hypothesis in Pre-Trained Language Models
    Steed, Ryan
    Panda, Swetasudha
    Kobren, Ari
    Wick, Michael
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 3524 - 3542
  • [6] FILP-3D: Enhancing 3D few-shot class-incremental learning with pre-trained vision-language models
    Xu, Wan
    Huang, Tianyu
    Qu, Tianyuan
    Yang, Guanglei
    Guo, Yiwen
    Zuo, Wangmeng
    PATTERN RECOGNITION, 2025, 165
  • [7] Revisiting Pre-trained Models for Chinese Natural Language Processing
    Cui, Yiming
    Che, Wanxiang
    Liu, Ting
    Qin, Bing
    Wang, Shijin
    Hu, Guoping
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 657 - 668
  • [8] Learning to Modulate pre-trained Models in RL
    Schmied, Thomas
    Hofmarcher, Markus
    Paischer, Fabian
    Pascanu, Razvan
    Hochreiter, Sepp
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [9] Continual Learning with Pre-Trained Models: A Survey
    Zhou, Da-Wei
    Sun, Hai-Long
    Ning, Jingyi
    Ye, Han-Jia
    Zhan, De-Chuan
    PROCEEDINGS OF THE THIRTY-THIRD INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2024, 2024, : 8363 - 8371
  • [10] iNeMo: Incremental Neural Mesh Models for Robust Class-Incremental Learning
    Fischer, Tom
    Liu, Yaoyao
    Jesslen, Artur
    Ahmed, Noor
    Kaushik, Prakhar
    Wang, Angtian
    Yuille, Alan L.
    Kortylewski, Adam
    Ilg, Eddy
    COMPUTER VISION - ECCV 2024, PT LXXVII, 2024, 15135 : 357 - 374