S-Prompts Learning with Pre-trained Transformers: An Occam's Razor for Domain Incremental Learning

Cited by: 0
Authors
Wang, Yabin [1 ,2 ]
Huang, Zhiwu [2 ]
Hong, Xiaopeng [1 ,3 ,4 ]
Affiliations
[1] Xi An Jiao Tong Univ, Xian, Peoples R China
[2] Singapore Management Univ, Singapore, Singapore
[3] Harbin Inst Technol, Harbin, Peoples R China
[4] Pengcheng Lab, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
State-of-the-art deep neural networks still struggle to address the catastrophic forgetting problem in continual learning. In this paper, we propose a simple paradigm (named S-Prompting) and two concrete approaches that greatly reduce forgetting in one of the most typical continual learning scenarios, domain incremental learning (DIL). The key idea of the paradigm is to learn prompts independently across domains with pre-trained transformers, avoiding the exemplars commonly used by conventional methods. This yields a win-win situation in which prompting can perform at its best for each domain. Independent prompting across domains requires only a single cross-entropy loss for training and a simple K-NN operation as a domain identifier at inference. The paradigm gives rise to an image prompt learning approach and a novel language-image prompt learning approach. With excellent scalability (a 0.03% parameter increase per domain), the best of our approaches achieves a remarkable relative improvement (about 30% on average) over the best state-of-the-art exemplar-free methods on three standard DIL tasks, and even surpasses the best of those methods by about 6% on average (relative) when they use exemplars. Source code is available at https://github.com/iamwangyabin/S-Prompts.
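The paradigm described in the abstract (one independently learned prompt per domain on a frozen pre-trained transformer, a single cross-entropy loss for training, and a K-NN domain identifier at inference) can be illustrated with the minimal sketch below. This is an assumption-laden illustration, not the authors' released code: the class name SPromptsSketch, the `backbone` argument, and the feature-level "prompting" are placeholders for exposition.

```python
# Minimal sketch (illustration only, not the authors' implementation) of the
# S-Prompts idea: one prompt + one classifier head per domain on top of a
# frozen pre-trained encoder, plus a nearest-centroid domain identifier.
import torch
import torch.nn as nn


class SPromptsSketch(nn.Module):
    def __init__(self, backbone: nn.Module, feat_dim: int, prompt_len: int, num_classes: int):
        super().__init__()
        self.backbone = backbone                 # frozen pre-trained encoder, e.g. a ViT
        for p in self.backbone.parameters():
            p.requires_grad_(False)
        self.feat_dim = feat_dim
        self.prompt_len = prompt_len
        self.num_classes = num_classes
        self.prompts = nn.ParameterList()        # one independent prompt per domain
        self.heads = nn.ModuleList()             # one classifier head per domain
        self.centroids = []                      # per-domain feature centroids (K-NN keys)

    def add_domain(self) -> int:
        # Only a small prompt and head are added per domain, which is what keeps
        # the per-domain parameter growth tiny (0.03% per domain in the paper).
        self.prompts.append(nn.Parameter(0.02 * torch.randn(self.prompt_len, self.feat_dim)))
        self.heads.append(nn.Linear(self.feat_dim, self.num_classes))
        return len(self.prompts) - 1

    def forward_domain(self, x: torch.Tensor, d: int) -> torch.Tensor:
        # Training-time forward for domain d; optimize with plain cross-entropy.
        feats = self.backbone(x)                           # (B, feat_dim)
        # Toy "prompting": add the pooled prompt to the features. The real method
        # would prepend learnable prompt tokens to the transformer's input instead.
        return self.heads[d](feats + self.prompts[d].mean(dim=0))

    @torch.no_grad()
    def infer(self, x: torch.Tensor) -> torch.Tensor:
        # Inference: identify the domain by nearest centroid, then use its prompt/head.
        feats = self.backbone(x)                           # (B, feat_dim)
        keys = torch.stack(self.centroids)                 # (num_domains, feat_dim)
        nearest = torch.cdist(feats, keys).argmin(dim=1)   # 1-NN domain identifier
        logits = [self.heads[d](feats[i] + self.prompts[d].mean(dim=0))
                  for i, d in enumerate(nearest.tolist())]
        return torch.stack(logits)
```

Under these assumptions, training a new domain would call add_domain(), optimize only the new prompt and head with torch.nn.functional.cross_entropy, and then append a summary of that domain's training features (e.g. their mean or clustering centroids) to `centroids` for the inference-time K-NN identifier. The paper's image variant attaches the prompts to a frozen image transformer, while the language-image variant additionally learns text-side prompts.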
Pages: 14
Related Papers
50 records in total
  • [31] Expandable Subspace Ensemble for Pre-Trained Model-Based Class-Incremental Learning
    Zhou, Da-Wei
    Sun, Hai-Long
    Ye, Han-Jia
    Zhan, De-Chuan
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 23554 - 23564
  • [32] Budget Restricted Incremental Learning with Pre-Trained Convolutional Neural Networks and Binary Associative Memories
    Boukli Hacene, Ghouthi
    Gripon, Vincent
    Farrugia, Nicolas
    Arzel, Matthieu
    Jezequel, Michel
    Journal of Signal Processing Systems, 2019, 91 : 1063 - 1073
  • [33] An Occam's Razor View on Learning Audiovisual Emotion Recognition with Small Training Sets
    Vielzeuf, Valentin
    Kervadec, Corentin
    Pateux, Stephane
    Lechervy, Alexis
    Jurie, Frederic
    ICMI'18: PROCEEDINGS OF THE 20TH ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2018, : 589 - 593
  • [34] Federated Learning from Pre-Trained Models: A Contrastive Learning Approach
    Tan, Yue
    Long, Guodong
    Ma, Jie
    Liu, Lu
    Zhou, Tianyi
    Jiang, Jing
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [35] On the Validity of Pre-Trained Transformers for Natural Language Processing in the Software Engineering Domain
    von der Mosel, Julian
    Trautsch, Alexander
    Herbold, Steffen
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2023, 49 (04) : 1487 - 1507
  • [36] Structure learning and the Occam's razor principle: a new view of human function acquisition
    Narain, Devika
    Smeets, Jeroen B. J.
    Mamassian, Pascal
    Brenner, Eli
    van Beers, Robert J.
    FRONTIERS IN COMPUTATIONAL NEUROSCIENCE, 2014, 8
  • [37] PTMA: Pre-trained Model Adaptation for Transfer Learning
    Li, Xiao
    Yan, Junkai
    Jiang, Jianjian
    Zheng, Wei-Shi
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT I, KSEM 2024, 2024, 14884 : 176 - 188
  • [38] Towards Inadequately Pre-trained Models in Transfer Learning
    Deng, Andong
    Li, Xingjian
    Hu, Di
    Wang, Tianyang
    Xiong, Haoyi
    Xu, Cheng-Zhong
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 19340 - 19351
  • [39] Transfer learning with pre-trained conditional generative models
    Yamaguchi, Shin'ya
    Kanai, Sekitoshi
    Kumagai, Atsutoshi
    Chijiwa, Daiki
    Kashima, Hisashi
    MACHINE LEARNING, 2025, 114 (04)
  • [40] Deep Learning-based POS Tagger and Chunker for Odia Language Using Pre-trained Transformers
    Dalai, Tusarkanta
    Mishra, Tapas Kumar
    Sa, Pankaj K.
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2024, 23 (02)