Automated Progressive Learning for Efficient Training of Vision Transformers

Cited by: 10
Authors
Li, Changlin [1 ,2 ,3 ]
Zhuang, Bohan [3 ]
Wang, Guangrun [4 ]
Liang, Xiaodan [5 ]
Chang, Xiaojun [2 ]
Yang, Yi [6 ]
Affiliations
[1] Baidu Res, Beijing, Peoples R China
[2] Univ Technol Sydney, AAII, ReLER, Sydney, NSW, Australia
[3] Monash Univ, Clayton, Vic, Australia
[4] Univ Oxford, Oxford, England
[5] Sun Yat Sen Univ, Guangzhou, Guangdong, Peoples R China
[6] Zhejiang Univ, Hangzhou, Zhejiang, Peoples R China
Funding
Australian Research Council; National Key R&D Program of China;
Keywords
DOI
10.1109/CVPR52688.2022.01216
CLC number
TP18 [Artificial intelligence theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Recent advances in vision Transformers (ViTs) have come with a voracious appetite for computing power, highlighting the urgent need to develop efficient training methods for ViTs. Progressive learning, a training scheme in which model capacity grows progressively during training, has recently shown promise for efficient training. In this paper, we take a practical step towards efficient training of ViTs by customizing and automating progressive learning. First, we develop a strong manual baseline for progressive learning of ViTs by introducing momentum growth (MoGrow) to bridge the gap brought by model growth. Then, we propose automated progressive learning (AutoProg), an efficient training scheme that aims to achieve lossless acceleration by automatically increasing the training overload on-the-fly; this is achieved by adaptively deciding whether, where and how much the model should grow during progressive learning. Specifically, we first relax the optimization of the growth schedule to a sub-network architecture optimization problem, then propose one-shot estimation of sub-network performance via an elastic supernet. The search overhead is reduced to a minimum by recycling the parameters of the supernet. Extensive efficient-training experiments on ImageNet with two representative ViT models, DeiT and VOLO, demonstrate that AutoProg can accelerate ViT training by up to 85.1% with no performance drop.
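To make the growth mechanism described in the abstract concrete, the following is a minimal, hypothetical PyTorch sketch of progressive depth growth with an EMA-based initialization of newly added blocks standing in for momentum growth (MoGrow). The class GrowingViT, the grow() helper, the fixed growth schedule, and the EMA initialization rule are illustrative assumptions, not the paper's released implementation, and the AutoProg supernet-based schedule search is omitted entirely.

```python
# Illustrative sketch of progressive depth growth with a MoGrow-style
# (momentum) initialization of new blocks. All names and the growth
# schedule are hypothetical; this is not the authors' code.
import copy
import torch
import torch.nn as nn

class GrowingViT(nn.Module):
    """Toy transformer encoder whose depth grows during training."""
    def __init__(self, d_model=192, nhead=3, depth=4, num_classes=1000):
        super().__init__()
        self.blocks = nn.ModuleList([
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            for _ in range(depth)
        ])
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, x):
        for blk in self.blocks:
            x = blk(x)
        return self.head(x.mean(dim=1))

def grow(model, ema_model, num_new):
    # Append new blocks and initialize each from the EMA (momentum) copy
    # of the current last block -- one plausible reading of MoGrow, not
    # necessarily the paper's exact rule.
    for _ in range(num_new):
        new_blk = copy.deepcopy(ema_model.blocks[-1])
        model.blocks.append(new_blk)
        ema_model.blocks.append(copy.deepcopy(new_blk))

model = GrowingViT(depth=4)
ema = copy.deepcopy(model)            # momentum (EMA) copy of the weights
grow_epochs = {5: 2, 10: 2}           # hypothetical schedule: epoch -> blocks added
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

for epoch in range(15):
    if epoch in grow_epochs:
        grow(model, ema, grow_epochs[epoch])
        # the parameter set changed, so rebuild the optimizer
        optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
    x = torch.randn(2, 8, 192)        # dummy batch: (batch, tokens, dim)
    loss = model(x).square().mean()   # dummy objective for illustration
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    with torch.no_grad():             # momentum update of the EMA copy
        for p, q in zip(model.parameters(), ema.parameters()):
            q.mul_(0.999).add_(p, alpha=0.001)
```

Note that the optimizer is rebuilt after every growth step because the parameter set changes; in AutoProg, the hard-coded grow_epochs table would instead be decided on-the-fly by the elastic-supernet schedule search.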
Pages: 12476-12486
Number of pages: 11
Related papers (50 in total)
• [21] Nakashima K.; Kataoka H.; Satoh Y. Evaluation of FractalDB Pre-training with Vision Transformers. Seimitsu Kogaku Kaishi/Journal of the Japan Society for Precision Engineering, 2023, 89(01): 99-104.
• [22] Meng, Lingchen; Li, Hengduo; Chen, Bor-Chun; Lan, Shiyi; Wu, Zuxuan; Jiang, Yu-Gang; Lim, Ser-Nam. AdaViT: Adaptive Vision Transformers for Efficient Image Recognition. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022: 12299-12308.
• [23] Rao, Yongming; Zhao, Wenliang; Liu, Benlin; Lu, Jiwen; Zhou, Jie; Hsieh, Cho-Jui. DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 2021.
• [24] Chen, Mengzhao; Shao, Wenqi; Xu, Peng; Lin, Mingbao; Zhang, Kaipeng; Chao, Fei; Ji, Rongrong; Qiao, Yu; Luo, Ping. DiffRate: Differentiable Compression Rate for Efficient Vision Transformers. 2023 IEEE/CVF International Conference on Computer Vision (ICCV 2023), 2023: 17118-17128.
• [25] He, Xuehai; Li, Chuanyuan; Zhang, Pengchuan; Yang, Jianwei; Wang, Xin Eric. Parameter-Efficient Model Adaptation for Vision Transformers. Thirty-Seventh AAAI Conference on Artificial Intelligence, 37(1), 2023: 817-825.
• [26] Liu, Yahui; Sangineto, Enver; Bi, Wei; Sebe, Nicu; Lepri, Bruno; De Nadai, Marco. Efficient Training of Visual Transformers with Small Datasets. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 2021.
• [27] Ermis, Beyza; Zappella, Giovanni; Wistuba, Martin; Rawal, Aditya; Archambeau, Cedric. Memory Efficient Continual Learning with Transformers. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022.
• [28] Mo, Yichuan; Wu, Dongxian; Wang, Yifei; Guo, Yiwen; Wang, Yisen. When Adversarial Training Meets Vision Transformers: Recipes from Training to Architecture. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022.
• [29] Rozendo, Guilherme Botazzo; Roberto, Guilherme Freire; Zanchetta do Nascimento, Marcelo; Neves, Leandro Alves; Lumini, Alessandra. Weeds Classification with Deep Learning: An Investigation Using CNN, Vision Transformers, Pyramid Vision Transformers, and Ensemble Strategy. Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications (CIARP 2023), Part I, 2024, 14469: 229-243.
• [30] Chen, Xinlei; Xie, Saining; He, Kaiming. An Empirical Study of Training Self-Supervised Vision Transformers. 2021 IEEE/CVF International Conference on Computer Vision (ICCV 2021), 2021: 9620-9629.