S-Prompts Learning with Pre-trained Transformers: An Occam's Razor for Domain Incremental Learning

Cited: 0
Authors
Wang, Yabin [1 ,2 ]
Huang, Zhiwu [2 ]
Hong, Xiaopeng [1 ,3 ,4 ]
Affiliations
[1] Xi An Jiao Tong Univ, Xian, Peoples R China
[2] Singapore Management Univ, Singapore, Singapore
[3] Harbin Inst Technol, Harbin, Peoples R China
[4] Pengcheng Lab, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
CLC Classification
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
State-of-the-art deep neural networks still struggle to address the catastrophic forgetting problem in continual learning. In this paper, we propose one simple paradigm (named S-Prompting) and two concrete approaches that greatly reduce forgetting in one of the most typical continual learning scenarios, i.e., domain incremental learning (DIL). The key idea of the paradigm is to learn prompts independently across domains with pre-trained transformers, avoiding the exemplars commonly used by conventional methods. This results in a win-win game where the prompting can achieve the best for each domain. The independent prompting across domains requires only a single cross-entropy loss for training and a simple K-NN operation as a domain identifier for inference. The learning paradigm derives an image prompt learning approach and a novel language-image prompt learning approach. With excellent scalability (a 0.03% parameter increase per domain), the best of our approaches achieves a remarkable relative improvement (about 30% on average) over the best state-of-the-art exemplar-free methods on three standard DIL tasks, and even surpasses the best of them by a relative margin of about 6% on average when they use exemplars. Source code is available at https://github.com/iamwangyabin/S-Prompts.
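The abstract outlines the full pipeline the paper relies on: per-domain prompts trained independently with a plain cross-entropy loss on top of a frozen pre-trained transformer, plus a K-NN over stored feature centroids that picks the domain (and hence the prompt and head) at inference. Below is a minimal, hypothetical PyTorch sketch of that flow, not the authors' implementation; the backbone interface, the extra_tokens argument, and all class and method names are illustrative assumptions (the actual code is at the GitHub link above).

# Hypothetical sketch of the S-Prompting flow described in the abstract.
# Assumptions: `backbone` is a frozen pre-trained transformer callable that
# returns a (B, embed_dim) feature, optionally accepting extra prompt tokens.
import torch

class SPromptsSketch:
    def __init__(self, backbone, embed_dim=768, prompt_len=10):
        self.backbone = backbone      # frozen pre-trained transformer (e.g., a ViT)
        self.prompt_len = prompt_len
        self.embed_dim = embed_dim
        self.prompts = []             # one independently learned prompt per domain
        self.heads = []               # one classifier head per domain
        self.centroids = []           # K-Means centroids of backbone features per domain

    def add_domain(self, num_classes, domain_centroids):
        # Each new domain only adds a short prompt and a small linear head,
        # which is what keeps the per-domain parameter growth tiny.
        self.prompts.append(torch.nn.Parameter(
            torch.randn(self.prompt_len, self.embed_dim) * 0.02))
        self.heads.append(torch.nn.Linear(self.embed_dim, num_classes))
        self.centroids.append(domain_centroids)   # (K, embed_dim), from K-Means

    @torch.no_grad()
    def identify_domain(self, x):
        # K-NN domain identifier: the nearest stored centroid decides the domain.
        feat = self.backbone(x)                         # (B, embed_dim), prompt-free
        all_centroids = torch.cat(self.centroids)       # (total_centroids, embed_dim)
        owner = torch.repeat_interleave(
            torch.arange(len(self.centroids)),
            torch.tensor([c.shape[0] for c in self.centroids]))
        dists = torch.cdist(feat, all_centroids)        # (B, total_centroids)
        return owner[dists.argmin(dim=1)]               # domain index per sample

    def forward(self, x, domain_id):
        # Prepend the selected domain's prompt tokens and classify with its head;
        # during training, only this prompt and head receive gradients
        # from a standard cross-entropy loss.
        feat = self.backbone(x, extra_tokens=self.prompts[domain_id])
        return self.heads[domain_id](feat)

Because the backbone stays frozen and each domain contributes only a short prompt and a small head, the parameter count grows very slowly with the number of domains, which is the scalability property (roughly 0.03% extra parameters per domain) claimed in the abstract.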
Pages: 14
Related Papers
50 items in total
  • [41] Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need
    Zhou, Da-Wei
    Cai, Zi-Wen
    Ye, Han-Jia
    Zhan, De-Chuan
    Liu, Ziwei
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2025, 133 (03) : 1012 - 1032
  • [42] Domain Adaptation with Pre-trained Transformers for Query-Focused Abstractive Text Summarization
    Laskar, Md Tahmid Rahman
    Hoque, Enamul
    Huang, Jimmy Xiangji
    COMPUTATIONAL LINGUISTICS, 2022, 48 (02) : 279 - 320
  • [43] Backdoor Attacks Against Transfer Learning With Pre-Trained Deep Learning Models
    Wang, Shuo
    Nepal, Surya
    Rudolph, Carsten
    Grobler, Marthie
    Chen, Shangyu
    Chen, Tianle
    IEEE TRANSACTIONS ON SERVICES COMPUTING, 2022, 15 (03) : 1526 - 1539
  • [44] On learning functions from noise-free and noisy samples via occam's razor
    Natarajan, B
    SIAM JOURNAL ON COMPUTING, 2000, 29 (03) : 712 - 727
  • [45] Enhancing Alzheimer's Disease Classification with Transfer Learning: Fine-tuning a Pre-trained Algorithm
    Boudi, Abdelmounim
    He, Jingfei
    Abd El Kader, Isselmou
    CURRENT MEDICAL IMAGING, 2024,
  • [46] MULTI-DOMAIN MACHINE LEARNING APPROACH OF NAMED ENTITY RECOGNITION FOR ARABIC BOOKING CHATBOT ENGINES USING PRE-TRAINED BIDIRECTIONAL TRANSFORMERS
    Sadder, Boshra
    Sadder, Rahma
    Abandah, Gheith
    Jafar, Iyad
    JORDANIAN JOURNAL OF COMPUTERS AND INFORMATION TECHNOLOGY, 2024, 10 (01): : 1 - 16
  • [47] Joint Learning of Pre-Trained and Random Units for Domain Adaptation in Part-of-Speech Tagging
    Meftah, Sara
    Tamaazousti, Youssef
    Semmar, Nasredine
    Essafi, Hassane
    Sadat, Fatiha
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 4107 - 4112
  • [48] Pre-trained Visual Dynamics Representations for Efficient Policy Learning
    Luo, Hao
    Zhou, Bohan
    Lu, Zongqing
    COMPUTER VISION - ECCV 2024, PT LXXXI, 2025, 15139 : 249 - 267
  • [49] RanPAC: Random Projections and Pre-trained Models for Continual Learning
    McDonnell, Mark D.
    Gong, Dong
    Parvaneh, Amin
    Abbasnejad, Ehsan
    van den Hengel, Anton
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [50] CODEEDITOR: Learning to Edit Source Code with Pre-trained Models
    Li, Jia
    Li, Ge
    Li, Zhuo
    Jin, Zhi
    Hu, Xing
    Zhang, Kechi
    Fu, Zhiyi
    ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2023, 32 (06)