S-Prompts Learning with Pre-trained Transformers: An Occam's Razor for Domain Incremental Learning

Cited by: 0
Authors
Wang, Yabin [1 ,2 ]
Huang, Zhiwu [2 ]
Hong, Xiaopeng [1 ,3 ,4 ]
Affiliations
[1] Xi An Jiao Tong Univ, Xian, Peoples R China
[2] Singapore Management Univ, Singapore, Singapore
[3] Harbin Inst Technol, Harbin, Peoples R China
[4] Pengcheng Lab, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China;
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
State-of-the-art deep neural networks still struggle to address the catastrophic forgetting problem in continual learning. In this paper, we propose one simple paradigm (named S-Prompting) and two concrete approaches that greatly reduce forgetting in one of the most typical continual learning scenarios, i.e., domain incremental learning (DIL). The key idea of the paradigm is to learn prompts independently across domains with pre-trained transformers, avoiding the use of exemplars that conventional methods commonly rely on. This results in a win-win game where the prompting can achieve the best for each domain. Independent prompting across domains requires only a single cross-entropy loss for training and one simple K-NN operation as a domain identifier for inference. The learning paradigm derives an image prompt learning approach and a novel language-image prompt learning approach. With excellent scalability (a 0.03% parameter increase per domain), the best of our approaches achieves a remarkable relative improvement (about 30% on average) over the best of the state-of-the-art exemplar-free methods on three standard DIL tasks, and even surpasses the best of them by about 6% relative on average when they use exemplars. Source code is available at https://github.com/iamwangyabin/S-Prompts.
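To make the paradigm concrete, the following is a minimal sketch of the per-domain prompt pool and the K-NN-style domain identification described in the abstract. It is illustrative only: the names (SPromptsPool, add_domain, identify_domain) are assumptions rather than the authors' released API, and the random centroid selection stands in for the K-Means clustering a full implementation would use before the nearest-centroid lookup.

```python
import numpy as np

# Hedged sketch of the S-Prompting inference idea: one prompt and one set of
# feature centroids are stored per domain; at test time a nearest-centroid
# lookup (K-NN over centroids) picks the domain whose prompt is then used.
class SPromptsPool:
    def __init__(self, prompt_len, embed_dim):
        self.prompt_len = prompt_len
        self.embed_dim = embed_dim
        self.prompts = []    # one independently learned prompt per domain
        self.centroids = []  # feature centroids summarizing each domain

    def add_domain(self, domain_features, n_centroids=5):
        """Register a new domain: initialize a fresh prompt and store centroids."""
        prompt = 0.02 * np.random.randn(self.prompt_len, self.embed_dim)
        self.prompts.append(prompt)
        # Crude centroid choice (a real implementation would run K-Means here).
        idx = np.random.choice(len(domain_features), n_centroids, replace=False)
        self.centroids.append(domain_features[idx])

    def identify_domain(self, feature):
        """Nearest-centroid search over all domains returns the owning domain's index."""
        dists = [np.linalg.norm(c - feature, axis=1).min() for c in self.centroids]
        return int(np.argmin(dists))

    def prompt_for(self, feature):
        """Select the prompt of the identified domain for prompted inference."""
        return self.prompts[self.identify_domain(feature)]


# Toy usage: two hypothetical domains with shifted feature statistics.
pool = SPromptsPool(prompt_len=4, embed_dim=8)
pool.add_domain(np.random.randn(100, 8) + 3.0)  # domain 0
pool.add_domain(np.random.randn(100, 8) - 3.0)  # domain 1
test_feat = np.random.randn(8) - 3.0            # feature resembling domain 1
print(pool.identify_domain(test_feat))          # expected: 1
```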
Pages: 14
Related Papers
50 records in total
  • [21] Unsupervised Out-of-Domain Detection via Pre-trained Transformers
    Xu, Keyang
    Ren, Tongzheng
    Zhang, Shikun
    Feng, Yihao
    Xiong, Caiming
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1 (ACL-IJCNLP 2021), 2021, : 1052 - 1061
  • [22] Simple Models in Complex Worlds: Occam's Razor and Statistical Learning Theory
    Bargagli Stoffi, Falco J.
    Cevolani, Gustavo
    Gnecco, Giorgio
    MINDS AND MACHINES, 2022, 32 (01) : 13 - 42
  • [24] Learning to Modulate pre-trained Models in RL
    Schmied, Thomas
    Hofmarcher, Markus
    Paischer, Fabian
    Pascanu, Razvan
    Hochreiter, Sepp
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [25] Continual Learning with Pre-Trained Models: A Survey
    Zhou, Da-Wei
    Sun, Hai-Long
    Ning, Jingyi
    Ye, Han-Jia
    Zhan, De-Chuan
    PROCEEDINGS OF THE THIRTY-THIRD INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2024, 2024, : 8363 - 8371
  • [26] Detection of Alzheimer's disease using pre-trained deep learning models through transfer learning: a review
    Heenaye-Mamode Khan, Maleika
    Reesaul, Pushtika
    Auzine, Muhammad Muzzammil
    Taylor, Amelia
    ARTIFICIAL INTELLIGENCE REVIEW, 2024, 57 (10)
  • [27] Pashto poetry generation: deep learning with pre-trained transformers for low-resource languages
    Ullah, Imran
    Ullah, Khalil
    Khan, Hamad
    Aurangzeb, Khursheed
    Anwar, Muhammad Shahid
    Syed, Ikram
    PeerJ Computer Science, 2024, 10 : 1 - 23
  • [29] Budget Restricted Incremental Learning with Pre-Trained Convolutional Neural Networks and Binary Associative Memories
    Hacene, Ghouthi Boukli
    Gripon, Vincent
    Farrugia, Nicolas
    Arzel, Matthieu
    Jezequel, Michel
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2019, 91 (09): : 1063 - 1073
  • [30] Budget Restricted Incremental Learning with Pre-Trained Convolutional Neural Networks and Binary Associative Memories
    Hacene, Ghouthi Boukli
    Gripon, Vincent
    Farrugia, Nicolas
    Arzel, Matthieu
    Jezequel, Michel
    2017 IEEE INTERNATIONAL WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS), 2017,