S-Prompts Learning with Pre-trained Transformers: An Occam's Razor for Domain Incremental Learning

Cited: 0
Authors
Wang, Yabin [1 ,2 ]
Huang, Zhiwu [2 ]
Hong, Xiaopeng [1 ,3 ,4 ]
Affiliations
[1] Xi An Jiao Tong Univ, Xian, Peoples R China
[2] Singapore Management Univ, Singapore, Singapore
[3] Harbin Inst Technol, Harbin, Peoples R China
[4] Pengcheng Lab, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
CLC Classification
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
State-of-the-art deep neural networks still struggle to address the catastrophic forgetting problem in continual learning. In this paper, we propose one simple paradigm (named S-Prompting) and two concrete approaches that greatly reduce forgetting in one of the most typical continual learning scenarios, i.e., domain incremental learning (DIL). The key idea of the paradigm is to learn prompts independently across domains with pre-trained transformers, avoiding the exemplars commonly used by conventional methods. This results in a win-win game where the prompting can achieve the best for each domain. The independent prompting across domains requires only a single cross-entropy loss for training and a simple K-NN operation as a domain identifier for inference. The learning paradigm derives an image prompt learning approach and a novel language-image prompt learning approach. With excellent scalability (a 0.03% parameter increase per domain), the best of our approaches achieves a remarkable relative improvement (about 30% on average) over the best state-of-the-art exemplar-free methods on three standard DIL tasks, and even surpasses the best of them by a relative margin of about 6% on average when they use exemplars. Source code is available at https://github.com/iamwangyabin/S-Prompts.
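The abstract outlines the full pipeline the paper relies on: per-domain prompts trained independently with a plain cross-entropy loss on top of a frozen pre-trained transformer, plus a K-NN over stored feature centroids that picks the domain (and hence the prompt and head) at inference. Below is a minimal, hypothetical PyTorch sketch of that flow, not the authors' implementation; the backbone interface, the extra_tokens argument, and all class and method names are illustrative assumptions (the actual code is at the GitHub link above).

# Hypothetical sketch of the S-Prompting flow described in the abstract.
# Assumptions: `backbone` is a frozen pre-trained transformer callable that
# returns a (B, embed_dim) feature, optionally accepting extra prompt tokens.
import torch

class SPromptsSketch:
    def __init__(self, backbone, embed_dim=768, prompt_len=10):
        self.backbone = backbone      # frozen pre-trained transformer (e.g., a ViT)
        self.prompt_len = prompt_len
        self.embed_dim = embed_dim
        self.prompts = []             # one independently learned prompt per domain
        self.heads = []               # one classifier head per domain
        self.centroids = []           # K-Means centroids of backbone features per domain

    def add_domain(self, num_classes, domain_centroids):
        # Each new domain only adds a short prompt and a small linear head,
        # which is what keeps the per-domain parameter growth tiny.
        self.prompts.append(torch.nn.Parameter(
            torch.randn(self.prompt_len, self.embed_dim) * 0.02))
        self.heads.append(torch.nn.Linear(self.embed_dim, num_classes))
        self.centroids.append(domain_centroids)   # (K, embed_dim), from K-Means

    @torch.no_grad()
    def identify_domain(self, x):
        # K-NN domain identifier: the nearest stored centroid decides the domain.
        feat = self.backbone(x)                         # (B, embed_dim), prompt-free
        all_centroids = torch.cat(self.centroids)       # (total_centroids, embed_dim)
        owner = torch.repeat_interleave(
            torch.arange(len(self.centroids)),
            torch.tensor([c.shape[0] for c in self.centroids]))
        dists = torch.cdist(feat, all_centroids)        # (B, total_centroids)
        return owner[dists.argmin(dim=1)]               # domain index per sample

    def forward(self, x, domain_id):
        # Prepend the selected domain's prompt tokens and classify with its head;
        # during training, only this prompt and head receive gradients
        # from a standard cross-entropy loss.
        feat = self.backbone(x, extra_tokens=self.prompts[domain_id])
        return self.heads[domain_id](feat)

Because the backbone stays frozen and each domain contributes only a short prompt and a small head, the parameter count grows very slowly with the number of domains, which is the scalability property (roughly 0.03% extra parameters per domain) claimed in the abstract.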
Pages: 14
Related Papers
50 items in total
  • [41] Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need
    Zhou, Da-Wei
    Cai, Zi-Wen
    Ye, Han-Jia
    Zhan, De-Chuan
    Liu, Ziwei
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2025, 133 (03) : 1012 - 1032
  • [42] Domain Adaptation with Pre-trained Transformers for Query-Focused Abstractive Text Summarization
    Laskar, Md Tahmid Rahman
    Hoque, Enamul
    Huang, Jimmy Xiangji
    COMPUTATIONAL LINGUISTICS, 2022, 48 (02) : 279 - 320
  • [43] Backdoor Attacks Against Transfer Learning With Pre-Trained Deep Learning Models
    Wang, Shuo
    Nepal, Surya
    Rudolph, Carsten
    Grobler, Marthie
    Chen, Shangyu
    Chen, Tianle
    IEEE TRANSACTIONS ON SERVICES COMPUTING, 2022, 15 (03) : 1526 - 1539
  • [44] On learning functions from noise-free and noisy samples via occam's razor
    Natarajan, B
    SIAM JOURNAL ON COMPUTING, 2000, 29 (03) : 712 - 727
  • [45] Enhancing Alzheimer's Disease Classification with Transfer Learning: Fine-tuning a Pre-trained Algorithm
    Boudi, Abdelmounim
    He, Jingfei
    Abd El Kader, Isselmou
    CURRENT MEDICAL IMAGING, 2024,
  • [46] MULTI-DOMAIN MACHINE LEARNING APPROACH OF NAMED ENTITY RECOGNITION FOR ARABIC BOOKING CHATBOT ENGINES USING PRE-TRAINED BIDIRECTIONAL TRANSFORMERS
    Sadder, Boshra
    Sadder, Rahma
    Abandah, Gheith
    Jafar, Iyad
    JORDANIAN JOURNAL OF COMPUTERS AND INFORMATION TECHNOLOGY, 2024, 10 (01): : 1 - 16
  • [47] Joint Learning of Pre-Trained and Random Units for Domain Adaptation in Part-of-Speech Tagging
    Meftah, Sara
    Tamaazousti, Youssef
    Semmar, Nasredine
    Essafi, Hassane
    Sadat, Fatiha
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 4107 - 4112
  • [48] Pre-trained Visual Dynamics Representations for Efficient Policy Learning
    Luo, Hao
    Zhou, Bohan
    Lu, Zongqing
    COMPUTER VISION - ECCV 2024, PT LXXXI, 2025, 15139 : 249 - 267
  • [49] RanPAC: Random Projections and Pre-trained Models for Continual Learning
    McDonnell, Mark D.
    Gong, Dong
    Parvaneh, Amin
    Abbasnejad, Ehsan
    van den Hengel, Anton
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [50] CODEEDITOR: Learning to Edit Source Code with Pre-trained Models
    Li, Jia
    Li, Ge
    Li, Zhuo
    Jin, Zhi
    Hu, Xing
    Zhang, Kechi
    Fu, Zhiyi
    ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2023, 32 (06)