Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need

Cited by: 5
Authors
Zhou, Da-Wei [1 ,2 ]
Cai, Zi-Wen [1 ,2 ]
Ye, Han-Jia [1 ,2 ]
Zhan, De-Chuan [1 ,2 ]
Liu, Ziwei [3 ]
Affiliations
[1] Nanjing Univ, Sch Artificial Intelligence, Nanjing 210023, Peoples R China
[2] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
[3] Nanyang Technol Univ, S Lab, Singapore City 639798, Singapore
Keywords
Class-incremental learning; Pre-trained models; Continual learning; Catastrophic forgetting; Representation
DOI
10.1007/s11263-024-02218-0
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting old ones. Traditional CIL models are trained from scratch to continually acquire knowledge as data evolves. Recently, pre-training has made substantial progress, making vast pre-trained models (PTMs) accessible for CIL. In contrast to traditionally trained models, PTMs possess generalizable embeddings that can be easily transferred to CIL. In this work, we revisit CIL with PTMs and argue that the core factors in CIL are adaptivity for model updating and generalizability for knowledge transfer. (1) We first reveal that a frozen PTM can already provide generalizable embeddings for CIL. Surprisingly, a simple baseline (SimpleCIL), which continually sets the classifier weights of the PTM to class prototype features, can beat state-of-the-art methods even without training on the downstream task. (2) Due to the distribution gap between pre-trained and downstream datasets, the PTM can be further endowed with adaptivity via model adaptation. We propose AdaPt and mERge (Aper), which aggregates the embeddings of the PTM and the adapted model for classifier construction. Aper is a general framework that can be orthogonally combined with any parameter-efficient tuning method, combining the PTM's generalizability with the adapted model's adaptivity. (3) Additionally, since previous ImageNet-based benchmarks are unsuitable in the era of PTMs due to data overlap, we propose four new benchmarks for assessment, namely ImageNet-A, ObjectNet, OmniBenchmark, and VTAB. Extensive experiments validate the effectiveness of Aper with a unified and concise framework. Code is available at https://github.com/zhoudw-zdw/RevisitingCIL.
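The following is a minimal, illustrative sketch of the prototype-classifier idea behind SimpleCIL as described in the abstract: embeddings from a frozen pre-trained backbone are averaged per class into prototypes, which directly serve as classifier weights for cosine-similarity prediction. The names (PrototypeClassifier, add_classes, predict) are hypothetical and not taken from the authors' released code.

# Minimal sketch (not the authors' implementation) of the prototype-based
# classifier idea behind SimpleCIL: a frozen pre-trained model supplies
# embeddings, and each new class's classifier weight is simply the mean
# (prototype) of its embeddings; prediction uses cosine similarity.
import numpy as np

class PrototypeClassifier:
    """Cosine-similarity classifier whose weights are class prototypes."""

    def __init__(self):
        self.prototypes = {}  # class id -> L2-normalized prototype vector

    def add_classes(self, features, labels):
        """Add prototypes for the classes of a new incremental task.

        features: (N, D) embeddings from the frozen pre-trained backbone
        labels:   (N,) integer class ids; no backbone training is involved
        """
        for cls in np.unique(labels):
            proto = features[labels == cls].mean(axis=0)
            self.prototypes[int(cls)] = proto / np.linalg.norm(proto)

    def predict(self, features):
        """Assign each sample to the class with the most similar prototype."""
        classes = sorted(self.prototypes)
        weight = np.stack([self.prototypes[c] for c in classes])  # (C, D)
        feats = features / np.linalg.norm(features, axis=1, keepdims=True)
        return np.asarray(classes)[(feats @ weight.T).argmax(axis=1)]

Under the Aper framework as described in the abstract, the same construction would presumably operate on a feature vector formed by aggregating the frozen PTM's embedding with the adapted model's embedding; that extension is omitted from this sketch.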
Pages: 1012-1032
Page count: 21