Neural Architecture Search for Parameter-Efficient Fine-tuning of Large Pre-trained Language Models

Cited by: 0
Authors
Lawton, Neal [1 ]
Kumar, Anoop [2 ]
Thattai, Govind [2 ]
Galstyan, Aram [2 ]
Ver Steeg, Greg [2 ]
Affiliations
[1] Information Sciences Institute, Marina del Rey, CA 90292, USA
[2] Amazon Alexa AI, Redmond, WA, USA
DOI: Not available
Abstract
Parameter-efficient tuning (PET) methods fit pre-trained language models (PLMs) to downstream tasks by either computing a small compressed update for a subset of model parameters, or appending and fine-tuning a small number of new model parameters to the pre-trained network. Hand-designed PET architectures from the literature perform well in practice, but have the potential to be improved via automated neural architecture search (NAS). We propose an efficient NAS method for learning PET architectures via structured and unstructured pruning. We present experiments on GLUE demonstrating the effectiveness of our algorithm and discuss how PET architectural design choices affect performance in practice.
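The abstract distinguishes two PET families (small compressed updates to a subset of existing parameters versus small appended modules that are fine-tuned) and proposes a pruning-based NAS procedure over such architectures. The following is a minimal, hypothetical PyTorch sketch to make the idea concrete: a frozen pre-trained linear layer with a LoRA-style low-rank update and a learnable per-rank gate that a structured-pruning search could threshold. The class name GatedLoRALinear, the rank, and the gate parameter are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical illustration only, not the paper's method: a frozen pre-trained
# linear layer with a LoRA-style low-rank update and a learnable per-rank gate
# that a pruning-based architecture search could threshold.
import torch
import torch.nn as nn

class GatedLoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # keep the pre-trained weights frozen
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.gate = nn.Parameter(torch.ones(rank))  # handle for structured pruning

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # base output plus a gated low-rank update; zeroed gates disable ranks
        update = ((x @ self.lora_a.t()) * self.gate) @ self.lora_b.t()
        return self.base(x) + update

layer = GatedLoRALinear(nn.Linear(768, 768), rank=8)
out = layer(torch.randn(2, 768))  # only lora_a, lora_b, and gate are trainable
```

In a search of this kind, gates driven toward zero during training would mark components of the PET architecture that can be pruned; the paper's actual structured and unstructured pruning criteria are described in the full text.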
Pages: 8506-8515 (10 pages)
Related Papers (50 in total)
  • [31] Parameter-Efficient Tuning for Object Tracking by Migrating Pre-Trained Decoders
    Zhang, Ruijuan
    Wang, Li
    Yang, Song
    ELECTRONICS, 2024, 13 (23)
  • [32] GamMa: Efficient Fine-Tuning of Pre-Trained Language Models Using Gradient Activation Mapping Masking
    Gui, Anchun
    Ye, Jinqiang
    Xiao, Han
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023
  • [33] Parameter-Efficient Fine-Tuning of Large Pretrained Models for Instance Segmentation Tasks
    Baker, Nermeen Abou
    Rohrschneider, David
    Handmann, Uwe
    MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 2024, 6 (04): 2783-2807
  • [34] Empirical study on fine-tuning pre-trained large language models for fault diagnosis of complex systems
    Zheng, Shuwen
    Pan, Kai
    Liu, Jie
    Chen, Yunxia
    RELIABILITY ENGINEERING & SYSTEM SAFETY, 2024, 252
  • [35] Towards Parameter-Efficient Integration of Pre-Trained Language Models In Temporal Video Grounding
    Shimomoto, Erica K.
    Marrese-Taylor, Edison
    Takamura, Hiroya
    Kobayashi, Ichiro
    Nakayama, Hideki
    Miyao, Yusuke
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023: 13101-13123
  • [36] Reparameterization-Based Parameter-Efficient Fine-Tuning Methods for Large Language Models: A Systematic Survey
    Chen, Zezhou
    Liu, Zhaoxiang
    Wang, Kai
    Lian, Shiguo
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT III, NLPCC 2024, 2025, 15361: 107-118
  • [37] Make Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning
    Liao, Baohao
    Tan, Shaomu
    Monz, Christof
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [38] Towards Fine-tuning Pre-trained Language Models with Integer Forward and Backward Propagation
    Tayaranian, Mohammadreza
    Ghaffari, Alireza
    Tahaei, Marzieh S.
    Rezagholizadeh, Mehdi
    Asgharian, Masoud
    Nia, Vahid Partovi
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023: 1912-1921
  • [39] Towards Anytime Fine-tuning: Continually Pre-trained Language Models with Hypernetwork Prompts
    Jiang, Gangwei
    Jiang, Caigao
    Xue, Siqiao
    Zhang, James Y.
    Zhou, Jun
    Lian, Defu
    Wei, Ying
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023: 12081-12095
  • [40] Fine-tuning Pre-Trained Transformer Language Models to Distantly Supervised Relation Extraction
    Alt, Christoph
    Huebner, Marc
    Hennig, Leonhard
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019: 1388-1398