Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need

Cited by: 5
Authors
Zhou, Da-Wei [1 ,2 ]
Cai, Zi-Wen [1 ,2 ]
Ye, Han-Jia [1 ,2 ]
Zhan, De-Chuan [1 ,2 ]
Liu, Ziwei [3 ]
Affiliations
[1] Nanjing Univ, Sch Artificial Intelligence, Nanjing 210023, Peoples R China
[2] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
[3] Nanyang Technol Univ, S Lab, Singapore City 639798, Singapore
Keywords
Class-incremental learning; Pre-trained models; Continual learning; Catastrophic forgetting; Representation
DOI
10.1007/s11263-024-02218-0
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting old ones. Traditional CIL models are trained from scratch to continually acquire knowledge as data evolves. Recently, pre-training has achieved substantial progress, making vast pre-trained models (PTMs) accessible for CIL. Contrary to traditional methods, PTMs possess generalizable embeddings, which can be easily transferred for CIL. In this work, we revisit CIL with PTMs and argue that the core factors in CIL are adaptivity for model updating and generalizability for knowledge transferring. (1) We first reveal that a frozen PTM can already provide generalizable embeddings for CIL. Surprisingly, a simple baseline (SimpleCIL), which continually sets the classifiers of the PTM to prototype features, can beat the state of the art even without training on the downstream task. (2) Due to the distribution gap between pre-trained and downstream datasets, the PTM can be further cultivated with adaptivity via model adaptation. We propose AdaPt and mERge (Aper), which aggregates the embeddings of the PTM and the adapted model for classifier construction. Aper is a general framework that can be orthogonally combined with any parameter-efficient tuning method, and it thus holds the advantages of the PTM's generalizability and the adapted model's adaptivity. (3) Additionally, since previous ImageNet-based benchmarks are unsuitable in the era of PTMs due to data overlap, we propose four new benchmarks for assessment, namely ImageNet-A, ObjectNet, OmniBenchmark, and VTAB. Extensive experiments validate the effectiveness of Aper within a unified and concise framework. Code is available at https://github.com/zhoudw-zdw/RevisitingCIL.
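The prototype-based baseline described in the abstract (SimpleCIL) can be illustrated with a short sketch. The code below is a minimal, hedged example of the general idea only: a frozen pre-trained backbone embeds the data of each incremental task, and each new class's classifier weight is set to the mean (prototype) of its embeddings, so no downstream training is required. All function and variable names (update_prototypes, predict, backbone, loader) are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of a prototype-based incremental classifier in the spirit
# of SimpleCIL, as summarized in the abstract. A frozen pre-trained
# backbone provides embeddings; each class's "classifier" is simply the
# mean embedding (prototype) of its training samples. Names are assumed.
import torch
import torch.nn.functional as F


@torch.no_grad()
def update_prototypes(backbone, loader, prototypes):
    """Add prototypes for the classes appearing in the current task."""
    backbone.eval()                          # the PTM stays frozen
    feats, labels = [], []
    for x, y in loader:                      # data of the current task only
        feats.append(backbone(x))            # pre-trained embeddings
        labels.append(y)
    feats, labels = torch.cat(feats), torch.cat(labels)
    for c in labels.unique():
        prototypes[int(c)] = feats[labels == c].mean(dim=0)
    return prototypes


@torch.no_grad()
def predict(backbone, x, prototypes):
    """Nearest-prototype prediction over all classes seen so far."""
    classes = sorted(prototypes)
    w = torch.stack([prototypes[c] for c in classes])   # (C, d) prototypes
    z = F.normalize(backbone(x), dim=-1)                 # (B, d) embeddings
    scores = z @ F.normalize(w, dim=-1).T                # cosine similarity
    return torch.tensor(classes)[scores.argmax(dim=-1)]
```

Per the abstract, Aper extends this construction by aggregating the embeddings of the frozen PTM with those of a parameter-efficiently adapted copy before computing the prototypes; that adaptation step is not shown in this sketch.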
Pages: 1012-1032
Number of pages: 21