Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need

Cited by: 5
Authors
Zhou, Da-Wei [1 ,2 ]
Cai, Zi-Wen [1 ,2 ]
Ye, Han-Jia [1 ,2 ]
Zhan, De-Chuan [1 ,2 ]
Liu, Ziwei [3 ]
Affiliations
[1] Nanjing Univ, Sch Artificial Intelligence, Nanjing 210023, Peoples R China
[2] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
[3] Nanyang Technol Univ, S Lab, Singapore City 639798, Singapore
Keywords
Class-incremental learning; Pre-trained models; Continual learning; Catastrophic forgetting; Representation
DOI
10.1007/s11263-024-02218-0
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting old ones. Traditional CIL models are trained from scratch to continually acquire knowledge as data evolves. Recently, pre-training has made substantial progress, making vast pre-trained models (PTMs) accessible for CIL. In contrast to traditional methods, PTMs possess generalizable embeddings that can be easily transferred for CIL. In this work, we revisit CIL with PTMs and argue that the core factors in CIL are adaptivity for model updating and generalizability for knowledge transfer. (1) We first reveal that a frozen PTM can already provide generalizable embeddings for CIL. Surprisingly, a simple baseline (SimpleCIL), which continually sets the classifiers of the PTM to prototype features, can beat state-of-the-art methods even without training on the downstream task. (2) Because of the distribution gap between pre-trained and downstream datasets, the PTM can be further cultivated with adaptivity via model adaptation. We propose AdaPt and mERge (Aper), which aggregates the embeddings of the PTM and the adapted model for classifier construction. Aper is a general framework that can be orthogonally combined with any parameter-efficient tuning method, retaining the advantages of the PTM's generalizability and the adapted model's adaptivity. (3) Additionally, since previous ImageNet-based benchmarks are unsuitable in the era of PTMs due to data overlap, we propose four new benchmarks for assessment, namely ImageNet-A, ObjectNet, OmniBenchmark, and VTAB. Extensive experiments validate the effectiveness of Aper within a unified and concise framework. Code is available at https://github.com/zhoudw-zdw/RevisitingCIL.
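To make the SimpleCIL idea described in the abstract concrete, below is a minimal sketch of a prototype-based classifier built on frozen features. It is an illustration only, not the authors' implementation (see the linked repository); the class name `PrototypeClassifier`, the feature dimension of 768, and the random stand-in features are assumptions made for this example.

```python
import numpy as np

class PrototypeClassifier:
    """Sketch of a prototype-based classifier in the spirit of SimpleCIL:
    each class weight is the mean of that class's frozen-PTM features,
    and prediction is cosine similarity against all stored prototypes."""

    def __init__(self):
        self.prototypes = {}  # class id -> (d,) mean feature vector

    def update_task(self, features, labels):
        # Add prototypes for the classes of the newly arrived task;
        # no gradient-based training on downstream data is required.
        for c in np.unique(labels):
            self.prototypes[int(c)] = features[labels == c].mean(axis=0)

    def predict(self, features):
        classes = sorted(self.prototypes)
        proto = np.stack([self.prototypes[c] for c in classes])        # (C, d)
        proto = proto / np.linalg.norm(proto, axis=1, keepdims=True)
        feats = features / np.linalg.norm(features, axis=1, keepdims=True)
        scores = feats @ proto.T                                        # cosine similarity
        return np.asarray(classes)[scores.argmax(axis=1)]

# Toy usage: random vectors stand in for embeddings from a frozen PTM encoder.
rng = np.random.default_rng(0)
clf = PrototypeClassifier()
clf.update_task(rng.normal(size=(40, 768)), rng.integers(0, 4, size=40))  # task 1: classes 0-3
clf.update_task(rng.normal(size=(40, 768)), rng.integers(4, 8, size=40))  # task 2: classes 4-7
print(clf.predict(rng.normal(size=(5, 768))))
```

Because each class is summarized by a single mean feature vector, adding a new task only appends prototypes and never rewrites old ones, which is why such a baseline can sidestep catastrophic forgetting without any training on the downstream data.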
Pages: 1012-1032
Number of pages: 21