Neural Architecture Search for Parameter-Efficient Fine-tuning of Large Pre-trained Language Models

Cited by: 0
Authors
Lawton, Neal [1 ]
Kumar, Anoop [2 ]
Thattai, Govind [2 ]
Galstyan, Aram [2 ]
Ver Steeg, Greg [2 ]
Affiliations
[1] Information Sciences Institute, Marina del Rey, CA 90292, USA
[2] Amazon Alexa AI, Redmond, WA, USA
DOI: Not available
Abstract
Parameter-efficient tuning (PET) methods fit pre-trained language models (PLMs) to downstream tasks by either computing a small compressed update for a subset of model parameters, or appending and fine-tuning a small number of new model parameters to the pre-trained network. Hand-designed PET architectures from the literature perform well in practice, but have the potential to be improved via automated neural architecture search (NAS). We propose an efficient NAS method for learning PET architectures via structured and unstructured pruning. We present experiments on GLUE demonstrating the effectiveness of our algorithm and discuss how PET architectural design choices affect performance in practice.
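The abstract distinguishes two PET families (small compressed updates to a subset of existing parameters versus small appended modules that are fine-tuned) and proposes a pruning-based NAS procedure over such architectures. The following is a minimal, hypothetical PyTorch sketch to make the idea concrete: a frozen pre-trained linear layer with a LoRA-style low-rank update and a learnable per-rank gate that a structured-pruning search could threshold. The class name GatedLoRALinear, the rank, and the gate parameter are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical illustration only, not the paper's method: a frozen pre-trained
# linear layer with a LoRA-style low-rank update and a learnable per-rank gate
# that a pruning-based architecture search could threshold.
import torch
import torch.nn as nn

class GatedLoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # keep the pre-trained weights frozen
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.gate = nn.Parameter(torch.ones(rank))  # handle for structured pruning

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # base output plus a gated low-rank update; zeroed gates disable ranks
        update = ((x @ self.lora_a.t()) * self.gate) @ self.lora_b.t()
        return self.base(x) + update

layer = GatedLoRALinear(nn.Linear(768, 768), rank=8)
out = layer(torch.randn(2, 768))  # only lora_a, lora_b, and gate are trainable
```

In a search of this kind, gates driven toward zero during training would mark components of the PET architecture that can be pruned; the paper's actual structured and unstructured pruning criteria are described in the full text.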
Pages: 8506-8515 (10 pages)
Related Papers (50 in total)
  • [31] Parameter-Efficient Tuning for Object Tracking by Migrating Pre-Trained Decoders
    Zhang, Ruijuan
    Wang, Li
    Yang, Song
    ELECTRONICS, 2024, 13 (23)
  • [32] GamMa: Efficient Fine-Tuning of Pre-Trained Language Models Using Gradient Activation Mapping Masking
    Gui, Anchun
    Ye, Jinqiang
    Xiao, Han
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023
  • [33] Parameter-Efficient Fine-Tuning of Large Pretrained Models for Instance Segmentation Tasks
    Baker, Nermeen Abou
    Rohrschneider, David
    Handmann, Uwe
    MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 2024, 6 (04): 2783-2807
  • [34] Empirical study on fine-tuning pre-trained large language models for fault diagnosis of complex systems
    Zheng, Shuwen
    Pan, Kai
    Liu, Jie
    Chen, Yunxia
    RELIABILITY ENGINEERING & SYSTEM SAFETY, 2024, 252
  • [35] Towards Parameter-Efficient Integration of Pre-Trained Language Models In Temporal Video Grounding
    Shimomoto, Erica K.
    Marrese-Taylor, Edison
    Takamura, Hiroya
    Kobayashi, Ichiro
    Nakayama, Hideki
    Miyao, Yusuke
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023: 13101-13123
  • [36] Reparameterization-Based Parameter-Efficient Fine-Tuning Methods for Large Language Models: A Systematic Survey
    Chen, Zezhou
    Liu, Zhaoxiang
    Wang, Kai
    Lian, Shiguo
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT III, NLPCC 2024, 2025, 15361: 107-118
  • [37] Make Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning
    Liao, Baohao
    Tan, Shaomu
    Monz, Christof
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [38] Towards Fine-tuning Pre-trained Language Models with Integer Forward and Backward Propagation
    Tayaranian, Mohammadreza
    Ghaffari, Alireza
    Tahaei, Marzieh S.
    Rezagholizadeh, Mehdi
    Asgharian, Masoud
    Nia, Vahid Partovi
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023: 1912-1921
  • [39] Towards Anytime Fine-tuning: Continually Pre-trained Language Models with Hypernetwork Prompts
    Jiang, Gangwei
    Jiang, Caigao
    Xue, Siqiao
    Zhang, James Y.
    Zhou, Jun
    Lian, Defu
    Wei, Ying
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023: 12081-12095
  • [40] Fine-tuning Pre-Trained Transformer Language Models to Distantly Supervised Relation Extraction
    Alt, Christoph
    Huebner, Marc
    Hennig, Leonhard
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019: 1388-1398