Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need

Cited by: 5
Authors
Zhou, Da-Wei [1 ,2 ]
Cai, Zi-Wen [1 ,2 ]
Ye, Han-Jia [1 ,2 ]
Zhan, De-Chuan [1 ,2 ]
Liu, Ziwei [3 ]
Affiliations
[1] Nanjing Univ, Sch Artificial Intelligence, Nanjing 210023, Peoples R China
[2] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
[3] Nanyang Technol Univ, S Lab, Singapore City 639798, Singapore
Keywords
Class-incremental learning; Pre-trained models; Continual learning; Catastrophic forgetting; Representation
DOI
10.1007/s11263-024-02218-0
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting old ones. Traditional CIL models are trained from scratch to continually acquire knowledge as data evolves. Recently, pre-training has achieved substantial progress, making vast pre-trained models (PTMs) accessible for CIL. Contrary to traditional methods, PTMs possess generalizable embeddings, which can be easily transferred for CIL. In this work, we revisit CIL with PTMs and argue that the core factors in CIL are adaptivity for model updating and generalizability for knowledge transferring. (1) We first reveal that a frozen PTM can already provide generalizable embeddings for CIL. Surprisingly, a simple baseline (SimpleCIL), which continually sets the classifiers of the PTM to prototype features, can beat the state of the art even without training on the downstream task. (2) Due to the distribution gap between pre-trained and downstream datasets, the PTM can be further cultivated with adaptivity via model adaptation. We propose AdaPt and mERge (Aper), which aggregates the embeddings of the PTM and the adapted model for classifier construction. Aper is a general framework that can be orthogonally combined with any parameter-efficient tuning method, and it thus holds the advantages of the PTM's generalizability and the adapted model's adaptivity. (3) Additionally, since previous ImageNet-based benchmarks are unsuitable in the era of PTMs due to data overlap, we propose four new benchmarks for assessment, namely ImageNet-A, ObjectNet, OmniBenchmark, and VTAB. Extensive experiments validate the effectiveness of Aper within a unified and concise framework. Code is available at https://github.com/zhoudw-zdw/RevisitingCIL.
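The prototype-based baseline described in the abstract (SimpleCIL) can be illustrated with a short sketch. The code below is a minimal, hedged example of the general idea only: a frozen pre-trained backbone embeds the data of each incremental task, and each new class's classifier weight is set to the mean (prototype) of its embeddings, so no downstream training is required. All function and variable names (update_prototypes, predict, backbone, loader) are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of a prototype-based incremental classifier in the spirit
# of SimpleCIL, as summarized in the abstract. A frozen pre-trained
# backbone provides embeddings; each class's "classifier" is simply the
# mean embedding (prototype) of its training samples. Names are assumed.
import torch
import torch.nn.functional as F


@torch.no_grad()
def update_prototypes(backbone, loader, prototypes):
    """Add prototypes for the classes appearing in the current task."""
    backbone.eval()                          # the PTM stays frozen
    feats, labels = [], []
    for x, y in loader:                      # data of the current task only
        feats.append(backbone(x))            # pre-trained embeddings
        labels.append(y)
    feats, labels = torch.cat(feats), torch.cat(labels)
    for c in labels.unique():
        prototypes[int(c)] = feats[labels == c].mean(dim=0)
    return prototypes


@torch.no_grad()
def predict(backbone, x, prototypes):
    """Nearest-prototype prediction over all classes seen so far."""
    classes = sorted(prototypes)
    w = torch.stack([prototypes[c] for c in classes])   # (C, d) prototypes
    z = F.normalize(backbone(x), dim=-1)                 # (B, d) embeddings
    scores = z @ F.normalize(w, dim=-1).T                # cosine similarity
    return torch.tensor(classes)[scores.argmax(dim=-1)]
```

Per the abstract, Aper extends this construction by aggregating the embeddings of the frozen PTM with those of a parameter-efficiently adapted copy before computing the prototypes; that adaptation step is not shown in this sketch.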
Pages: 1012-1032
Number of pages: 21