Task-Free Dynamic Sparse Vision Transformer for Continual Learning

Cited by: 0
Authors
Ye, Fei [1,2]
Bors, Adrian G. [1,2]
Affiliations
[1] Univ York, Dept Comp Sci, York YO10 5GH, N Yorkshire, England
[2] Mohamed Bin Zayed Univ Artificial Intelligence, Machine Learning Dept, Abu Dhabi, U Arab Emirates
Keywords: (none listed)
DOI: Not available
CLC classification: TP18 [Artificial Intelligence Theory]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract
Vision Transformers (ViTs) are self-attention-based network backbones that have proven efficient in many individual tasks, but they have not yet been explored in Task-Free Continual Learning (TFCL). Most existing ViT-based approaches for Continual Learning (CL) rely on task information. In this study, we explore the advantages of the ViT in a more challenging CL scenario where task boundaries are unavailable during training. To address this learning paradigm, we propose the Task-Free Dynamic Sparse Vision Transformer (TFDSViT), which can dynamically build new sparse experts, where each expert leverages sparsity to allocate the model's capacity for capturing different categories of information over time. To avoid forgetting and to efficiently reuse previously learned knowledge in subsequent learning, we propose a new dynamic dual attention mechanism consisting of the Sparse Attention (SA') and Knowledge Transfer Attention (KTA) modules. The SA' module refrains from updating certain previously learned attention blocks in order to preserve prior knowledge. The KTA module uses and regulates the information flow from all previously learned experts when learning new patterns. The proposed dual attention mechanism can simultaneously relieve forgetting and promote knowledge transfer for a dynamic expansion model in a task-free manner. We also propose an energy-based dynamic expansion mechanism that uses energy as a measure of novelty for incoming samples, providing appropriate expansion signals and leading to a compact network architecture for TFDSViT. Extensive empirical studies demonstrate the effectiveness of TFDSViT.
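Since the record provides only the abstract, the following is a minimal, hypothetical sketch (not the authors' code) of how an energy-based expansion signal of the kind described above could be computed, assuming the free-energy score E(x) = -logsumexp(logits) that is standard in energy-based novelty detection; the names `free_energy`, `should_expand`, `experts`, and `energy_threshold` are illustrative placeholders, and the exact criterion used by TFDSViT is defined in the paper.

```python
# Hypothetical sketch: energy-based expansion check for a dynamic expert pool.
# Assumes each expert is a callable module that maps a batch to class logits.
import torch


def free_energy(logits: torch.Tensor) -> torch.Tensor:
    """Per-sample free energy E(x) = -logsumexp(logits); higher values
    indicate the sample is less familiar to the expert that produced them."""
    return -torch.logsumexp(logits, dim=-1)


@torch.no_grad()
def should_expand(batch: torch.Tensor,
                  experts: list,
                  energy_threshold: float) -> bool:
    """Return True when even the best-matching existing expert assigns a
    mean energy above the threshold, i.e. no expert explains the incoming
    batch well and a new sparse expert would be built."""
    best_energy = min(free_energy(expert(batch)).mean().item()
                      for expert in experts)
    return best_energy > energy_threshold
```

Under this reading, expansion fires only when the lowest-energy (most familiar) expert still reports high energy on the incoming data, which is what keeps the dynamically grown architecture compact.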
Pages: 16442-16450
Number of pages: 9
Related papers (50 in total)
  • [1] Online Task-free Continual Learning with Dynamic Sparse Distributed Memory
    Pourcel, Julien
    Vu, Ngoc-Son
    French, Robert M.
    [J]. COMPUTER VISION, ECCV 2022, PT XXV, 2022, 13685 : 739 - 756
  • [2] Task-Free Continual Learning
    Aljundi, Rahaf
    Kelchtermans, Klaas
    Tuytelaars, Tinne
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 11246 - 11255
  • [3] Self-Evolved Dynamic Expansion Model for Task-Free Continual Learning
    Ye, Fei
    Bors, Adrian G.
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 22045 - 22055
  • [4] LEARNING AN EVOLVED MIXTURE MODEL FOR TASK-FREE CONTINUAL LEARNING
    Ye, Fei
    Bors, Adrian G.
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 1936 - 1940
  • [5] Task-Free Continual Generation and Representation Learning via Dynamic Expansionable Memory Cluster
    Ye, Fei
    Bors, Adrian G.
    [J]. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 15, 2024, : 16451 - 16459
  • [6] Online industrial fault prognosis in dynamic environments via task-free continual learning
    Liu, Chongdang
    Zhang, Linxuan
    Zheng, Yimeng
    Jiang, Zhengyi
    Zheng, Jinghao
    Wu, Cheng
    [J]. NEUROCOMPUTING, 2024, 598
  • [7] Task-Free Continual Learning via Online Discrepancy Distance Learning
    Ye, Fei
    Bors, Adrian G.
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [8] Improving Task-free Continual Learning by Distributionally Robust Memory Evolution
    Wang, Zhenyi
    Shen, Li
    Fang, Le
    Suo, Qiuling
    Duan, Tiehang
    Gao, Mingchen
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [9] Similarity-Based Adaptation for Task-Aware and Task-Free Continual Learning
    Adel, Tameem
    [J]. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2024, 80 : 377 - 417