Automated Progressive Learning for Efficient Training of Vision Transformers

Cited by: 10
Authors
Li, Changlin [1 ,2 ,3 ]
Zhuang, Bohan [3 ]
Wang, Guangrun [4 ]
Liang, Xiaodan [5 ]
Chang, Xiaojun [2 ]
Yang, Yi [6 ]
Affiliations
[1] Baidu Research, Beijing, China
[2] ReLER, AAII, University of Technology Sydney, Sydney, NSW, Australia
[3] Monash University, Clayton, VIC, Australia
[4] University of Oxford, Oxford, England
[5] Sun Yat-sen University, Guangzhou, Guangdong, China
[6] Zhejiang University, Hangzhou, Zhejiang, China
Funding
Australian Research Council; National Key Research and Development Program of China
DOI: 10.1109/CVPR52688.2022.01216
CLC classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Recent advances in vision Transformers (ViTs) have come with a voracious appetite for computing power, highlighting the urgent need to develop efficient training methods for ViTs. Progressive learning, a training scheme in which model capacity grows progressively during training, has started to show its potential for efficient training. In this paper, we take a practical step towards efficient training of ViTs by customizing and automating progressive learning. First, we develop a strong manual baseline for progressive learning of ViTs by introducing momentum growth (MoGrow) to bridge the gap brought by model growth. Then, we propose automated progressive learning (AutoProg), an efficient training scheme that aims to achieve lossless acceleration by automatically increasing the training load on the fly; this is achieved by adaptively deciding whether, where, and how much the model should grow during progressive learning. Specifically, we first relax the optimization of the growth schedule to a sub-network architecture optimization problem, then propose one-shot estimation of sub-network performance via an elastic supernet. The search overhead is reduced to a minimum by recycling the parameters of the supernet. Extensive experiments on efficient ImageNet training with two representative ViT models, DeiT and VOLO, demonstrate that AutoProg can accelerate ViT training by up to 85.1% with no performance drop.
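The abstract sketches two technical ideas: momentum growth (MoGrow), which initializes newly grown layers so as to bridge the gap brought by model growth, and a growth schedule chosen by evaluating sub-networks of an elastic supernet with recycled parameters. Below is a minimal PyTorch sketch of one plausible reading of the momentum-growth idea, assuming that an exponential-moving-average (momentum) copy of the network is maintained and that new blocks are initialized from it; the names TinyBlock and ProgressiveNet are hypothetical stand-ins for illustration, not the paper's actual implementation:

# Hypothetical sketch of momentum-based progressive depth growth ("MoGrow"-style).
# Assumptions beyond the abstract: the model is a stack of identical Transformer
# blocks, an EMA ("momentum") copy of the weights is kept during training, and
# each newly grown block is initialized from the EMA weights of an existing block
# to soften the distribution shift caused by growth.
import copy
import torch
import torch.nn as nn

class TinyBlock(nn.Module):
    """Stand-in for a Transformer encoder block (kept minimal for the sketch)."""
    def __init__(self, dim: int):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.mlp = nn.Linear(dim, dim)

    def forward(self, x):
        return x + self.mlp(self.norm(x))

class ProgressiveNet(nn.Module):
    def __init__(self, dim: int, depth: int):
        super().__init__()
        self.blocks = nn.ModuleList(TinyBlock(dim) for _ in range(depth))
        # Momentum (EMA) copies of the blocks, updated alongside training.
        self.ema_blocks = copy.deepcopy(self.blocks)
        for p in self.ema_blocks.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def update_ema(self, momentum: float = 0.999):
        # EMA update: q <- momentum * q + (1 - momentum) * p
        for blk, ema in zip(self.blocks, self.ema_blocks):
            for p, q in zip(blk.parameters(), ema.parameters()):
                q.mul_(momentum).add_(p, alpha=1.0 - momentum)

    @torch.no_grad()
    def grow(self, new_depth: int):
        """Deepen the network; init each new block from the EMA of the last block."""
        while len(self.blocks) < new_depth:
            new_block = copy.deepcopy(self.ema_blocks[-1])
            for p in new_block.parameters():
                p.requires_grad_(True)
            self.blocks.append(new_block)
            # Keep the EMA stack in lockstep with the trainable stack.
            self.ema_blocks.append(copy.deepcopy(new_block))
            for p in self.ema_blocks[-1].parameters():
                p.requires_grad_(False)

    def forward(self, x):
        for blk in self.blocks:
            x = blk(x)
        return x

# Usage: train shallow, keep the EMA fresh, then grow at a schedule point.
net = ProgressiveNet(dim=64, depth=4)
x = torch.randn(2, 16, 64)
_ = net(x)
net.update_ema()
net.grow(new_depth=6)  # e.g. a growth step chosen by the schedule
assert len(net.blocks) == 6

In a real training loop, the optimizer and its state would also have to be extended to cover the newly added parameters, and the growth points themselves would be chosen by AutoProg's schedule search over supernet sub-networks rather than fixed by hand as in this sketch.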
Pages: 12476-12486 (11 pages)
Related Papers (50 total)
  • [1] Towards Efficient Adversarial Training on Vision Transformers
    Wu, Boxi
    Gu, Jindong
    Li, Zhifeng
    Cai, Deng
    He, Xiaofei
    Liu, Wei
    [J]. COMPUTER VISION, ECCV 2022, PT XIII, 2022, 13673 : 307 - 325
  • [2] Ensembles of data-efficient vision transformers as a new paradigm for automated classification in ecology
    Kyathanahally, S. P.
    Hardeman, T.
    Reyes, M.
    Merz, E.
    Bulas, T.
    Brun, P.
    Pomati, F.
    Baity-Jesi, M.
    [J]. SCIENTIFIC REPORTS, 2022, 12 (01)
  • [3] Efficient continual learning at the edge with progressive segmented training
    Du, Xiaocong
    Venkataramanaiah, Shreyas Kolala
    Li, Zheng
    Suh, Han-Sok
    Yin, Shihui
    Krishnan, Gokul
    Liu, Frank
    Seo, Jae-sun
    Cao, Yu
    [J]. NEUROMORPHIC COMPUTING AND ENGINEERING, 2022, 2 (04)
  • [4] Training Vision Transformers in Federated Learning with Limited Edge-Device Resources
    Tao, Jiang
    Gao, Zhen
    Guo, Zhaohui
    [J]. ELECTRONICS, 2022, 11 (17)
  • [5] Automated Aircraft Recognition via Vision Transformers
    Huo, Yintong
    Peng, Yun
    Lyu, Michael
    [J]. 2023 IEEE AEROSPACE CONFERENCE, 2023,
  • [6] Patch Slimming for Efficient Vision Transformers
    Tang, Yehui
    Han, Kai
    Wang, Yunhe
    Xu, Chang
    Guo, Jianyuan
    Xu, Chao
    Tao, Dacheng
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 12155 - 12164
  • [7] A Survey on Efficient Training of Transformers
    Zhuang, Bohan
    Liu, Jing
    Pan, Zizheng
    He, Haoyu
    Weng, Yuetian
    Shen, Chunhua
    [J]. PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 6823 - 6831
  • [8] Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers
    Wei, Cong
    Duke, Brendan
    Jiang, Ruowei
    Aarabi, Parham
    Taylor, Graham W.
    Shkurti, Florian
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 22680 - 22689
  • [9] Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation
    Hao, Zhiwei
    Guo, Jianyuan
    Jia, Ding
    Han, Kai
    Tang, Yehui
    Zhang, Chao
    Hu, Han
    Wang, Yunhe
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,