Efficient Fine-Tuning of BERT Models on the Edge

Cited by: 3
Authors
Vucetic, Danilo [1 ]
Tayaranian, Mohammadreza [1 ]
Ziaeefard, Maryam [1 ]
Clark, James J. [1 ]
Meyer, Brett H. [1 ]
Gross, Warren J. [1 ]
Affiliations
[1] McGill Univ, Dept Elect & Comp Engn, Montreal, PQ, Canada
Keywords
Transformers; BERT; DistilBERT; NLP; Language Models; Efficient Transfer Learning; Efficient Fine-Tuning; Memory Efficiency; Time Efficiency; Edge Machine Learning;
DOI
10.1109/ISCAS48785.2022.9937567
CLC Classification Codes
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline Classification Codes
0808; 0809
Abstract
Resource-constrained devices are increasingly the deployment targets of machine learning applications. Static models, however, do not always suffice for dynamic environments. On-device training of models allows for quick adaptability to new scenarios. With the increasing size of deep neural networks, as exemplified by BERT and other natural language processing models, come increased resource requirements, namely memory, computation, energy, and time. Furthermore, training is far more resource-intensive than inference. Resource-constrained on-device learning is thus doubly difficult, especially with large BERT-like models. By reducing the memory usage of fine-tuning, pre-trained BERT models can become efficient enough to fine-tune on resource-constrained devices. We propose Freeze And Reconfigure (FAR), a memory-efficient training regime for BERT-like models that reduces the memory usage of activation maps during fine-tuning by avoiding unnecessary parameter updates. FAR reduces fine-tuning time on the DistilBERT model and CoLA dataset by 30%, and time spent on memory operations by 47%. Across the GLUE and SQuAD datasets, reductions in metric performance are around 1% on average.
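As a rough illustration of the general idea the abstract describes (avoiding unnecessary parameter updates so that gradient, optimizer-state, and associated activation storage can be skipped during fine-tuning), the sketch below freezes DistilBERT's feed-forward sublayers in an assumed PyTorch / Hugging Face setup. This is not the authors' FAR implementation; FAR's actual selection of frozen subnetworks and its dynamic reconfiguration are described in the paper itself, and the layer choice here is purely illustrative.

```python
# Minimal, generic sketch of parameter freezing during fine-tuning
# (assumed PyTorch + Hugging Face `transformers` setup; NOT the FAR code).
import torch
from transformers import DistilBertForSequenceClassification, DistilBertTokenizer

model = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased")
tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")

# Illustrative choice: freeze the feed-forward ("ffn") sublayers of every
# encoder block so they receive no gradients or optimizer state.
for name, param in model.named_parameters():
    if ".ffn." in name:
        param.requires_grad = False

# Hand only the remaining trainable parameters to the optimizer.
optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=2e-5
)

# One toy fine-tuning step.
batch = tokenizer(["on-device adaptation example"], return_tensors="pt")
labels = torch.tensor([1])
model.train()
loss = model(**batch, labels=labels).loss
loss.backward()       # no gradient buffers are allocated for the frozen weights
optimizer.step()
optimizer.zero_grad()
```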
Pages: 1838-1842
Page count: 5