Efficient Fine-Tuning of BERT Models on the Edge

Cited by: 6
Authors
Vucetic, Danilo [1 ]
Tayaranian, Mohammadreza [1 ]
Ziaeefard, Maryam [1 ]
Clark, James J. [1 ]
Meyer, Brett H. [1 ]
Gross, Warren J. [1 ]
Affiliation
[1] McGill Univ, Dept Elect & Comp Engn, Montreal, PQ, Canada
Keywords
Transformers; BERT; DistilBERT; NLP; Language Models; Efficient Transfer Learning; Efficient Fine-Tuning; Memory Efficiency; Time Efficiency; Edge Machine Learning;
DOI
10.1109/ISCAS48785.2022.9937567
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic and Communication Technology];
Discipline Classification Code
0808 ; 0809 ;
Abstract
Resource-constrained devices are increasingly the deployment targets of machine learning applications. Static models, however, do not always suffice for dynamic environments. On-device training of models allows for quick adaptability to new scenarios. The increasing size of deep neural networks, exemplified by BERT and other natural language processing models, brings increased resource requirements, namely memory, computation, energy, and time. Furthermore, training is far more resource-intensive than inference. Resource-constrained on-device learning is thus doubly difficult, especially with large BERT-like models. By reducing the memory usage of fine-tuning, pre-trained BERT models can become efficient enough to fine-tune on resource-constrained devices. We propose Freeze And Reconfigure (FAR), a memory-efficient training regime for BERT-like models that reduces the memory usage of activation maps during fine-tuning by avoiding unnecessary parameter updates. FAR reduces fine-tuning time on the DistilBERT model and CoLA dataset by 30%, and time spent on memory operations by 47%. More broadly, reductions in metric performance on the GLUE and SQuAD datasets are around 1% on average.
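As a rough illustration of the parameter-freezing idea behind FAR, the sketch below fine-tunes DistilBERT with a fixed subset of parameters frozen, so that no gradients (and fewer saved activations) are kept for the frozen parts. The choice of the feed-forward sublayers as the frozen subset and the use of the Hugging Face transformers API are assumptions made for illustration only; the paper's FAR method additionally reconfigures which parameters remain frozen during training, which is not reproduced here.

```python
# Minimal sketch (assumed setup, not the authors' FAR implementation):
# freeze a subset of DistilBERT parameters before fine-tuning so their
# gradients are never computed and fewer activations are retained.
import torch
from transformers import DistilBertForSequenceClassification

model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

# Hypothetical choice of frozen subset: the feed-forward sublayers.
# Attention and classifier parameters remain trainable.
for name, param in model.named_parameters():
    if ".ffn." in name:
        param.requires_grad = False

# The optimizer only receives the parameters that still require gradients.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=2e-5
)
```

Because the weight gradient of a frozen linear layer is never needed, its input activations need not be stored for the backward pass, which is the source of the activation-memory savings described in the abstract.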
Pages: 1838 - 1842
Page count: 5