Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need

Cited by: 5
Authors
Zhou, Da-Wei [1 ,2 ]
Cai, Zi-Wen [1 ,2 ]
Ye, Han-Jia [1 ,2 ]
Zhan, De-Chuan [1 ,2 ]
Liu, Ziwei [3 ]
Affiliations
[1] Nanjing Univ, Sch Artificial Intelligence, Nanjing 210023, Peoples R China
[2] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
[3] Nanyang Technol Univ, S Lab, Singapore City 639798, Singapore
Keywords
Class-incremental learning; Pre-trained models; Continual learning; Catastrophic forgetting; Representation
DOI
10.1007/s11263-024-02218-0
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting old ones. Traditional CIL models are trained from scratch to continually acquire knowledge as data evolves. Recently, pre-training has made substantial progress, making vast pre-trained models (PTMs) accessible for CIL. In contrast to traditionally trained models, PTMs possess generalizable embeddings that can be easily transferred to CIL. In this work, we revisit CIL with PTMs and argue that the core factors in CIL are adaptivity for model updating and generalizability for knowledge transfer. (1) We first reveal that a frozen PTM can already provide generalizable embeddings for CIL. Surprisingly, a simple baseline (SimpleCIL), which continually sets the classifier weights of the PTM to class prototype features, can beat state-of-the-art methods even without training on the downstream task. (2) Due to the distribution gap between pre-trained and downstream datasets, the PTM can be further endowed with adaptivity via model adaptation. We propose AdaPt and mERge (Aper), which aggregates the embeddings of the PTM and the adapted model for classifier construction. Aper is a general framework that can be orthogonally combined with any parameter-efficient tuning method, combining the PTM's generalizability with the adapted model's adaptivity. (3) Additionally, since previous ImageNet-based benchmarks are unsuitable in the era of PTMs due to data overlap, we propose four new benchmarks for assessment, namely ImageNet-A, ObjectNet, OmniBenchmark, and VTAB. Extensive experiments validate the effectiveness of Aper with a unified and concise framework. Code is available at https://github.com/zhoudw-zdw/RevisitingCIL.
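The following is a minimal, illustrative sketch of the prototype-classifier idea behind SimpleCIL as described in the abstract: embeddings from a frozen pre-trained backbone are averaged per class into prototypes, which directly serve as classifier weights for cosine-similarity prediction. The names (PrototypeClassifier, add_classes, predict) are hypothetical and not taken from the authors' released code.

# Minimal sketch (not the authors' implementation) of the prototype-based
# classifier idea behind SimpleCIL: a frozen pre-trained model supplies
# embeddings, and each new class's classifier weight is simply the mean
# (prototype) of its embeddings; prediction uses cosine similarity.
import numpy as np

class PrototypeClassifier:
    """Cosine-similarity classifier whose weights are class prototypes."""

    def __init__(self):
        self.prototypes = {}  # class id -> L2-normalized prototype vector

    def add_classes(self, features, labels):
        """Add prototypes for the classes of a new incremental task.

        features: (N, D) embeddings from the frozen pre-trained backbone
        labels:   (N,) integer class ids; no backbone training is involved
        """
        for cls in np.unique(labels):
            proto = features[labels == cls].mean(axis=0)
            self.prototypes[int(cls)] = proto / np.linalg.norm(proto)

    def predict(self, features):
        """Assign each sample to the class with the most similar prototype."""
        classes = sorted(self.prototypes)
        weight = np.stack([self.prototypes[c] for c in classes])  # (C, D)
        feats = features / np.linalg.norm(features, axis=1, keepdims=True)
        return np.asarray(classes)[(feats @ weight.T).argmax(axis=1)]

Under the Aper framework as described in the abstract, the same construction would presumably operate on a feature vector formed by aggregating the frozen PTM's embedding with the adapted model's embedding; that extension is omitted from this sketch.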
Pages: 1012-1032
Page count: 21