HiFi: High-Information Attention Heads Hold for Parameter-Efficient Model Adaptation

Cited by: 0
Authors
Gui, Anchun [1 ]
Xiao, Han [1 ]
Affiliations
[1] Xiamen Univ, Sch Informat, Dept Artificial Intelligence, Xiamen, Peoples R China
Source
PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023
Funding
National Natural Science Foundation of China
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
To fully leverage the advantages of large-scale pre-trained language models (PLMs) on downstream tasks, fine-tuning all of a PLM's parameters has become the ubiquitous adaptation paradigm. However, because of the large number of parameters in PLMs, this paradigm leads to inefficient updating and excessive resource consumption when fine-tuning in data-scarce and resource-limited scenarios. To alleviate these concerns, in this paper we propose HiFi, a parameter-efficient fine-tuning method that fine-tunes only the attention heads that are highly informative and strongly correlated for the specific task. To search for those significant attention heads, we develop a novel framework for analyzing head effectiveness: we first model the relationships between heads as a graph from the two perspectives of information richness and correlation, and then apply the PageRank algorithm to determine the relative importance of each head. Extensive experiments on the GLUE benchmark demonstrate the effectiveness of our method and show that HiFi achieves state-of-the-art performance over prior baselines.
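The head-selection idea sketched in the abstract (a graph over attention heads with edges weighted by inter-head correlation, scored by a PageRank variant biased toward information-rich heads) can be illustrated as follows. This is a minimal sketch, not the paper's implementation: the entropy-based `richness` measure, the absolute-correlation edge weights, the damping factor, and all variable names are assumptions made for the example.

```python
import numpy as np

def pagerank(adj, personalization, damping=0.85, tol=1e-8, max_iter=200):
    """Personalized PageRank by power iteration over a weighted graph."""
    n = adj.shape[0]
    # Column-normalize edge weights into a transition matrix.
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # avoid division by zero for isolated nodes
    M = adj / col_sums
    p = personalization / personalization.sum()
    r = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        r_new = damping * (M @ r) + (1 - damping) * p
        if np.abs(r_new - r).sum() < tol:
            return r_new
        r = r_new
    return r

# Toy example: 6 attention heads, each with an attention
# distribution over 10 tokens (rows of a stochastic matrix).
rng = np.random.default_rng(0)
attn = rng.dirichlet(np.ones(10), size=6)

# "Information richness" proxy: entropy of each head's attention distribution.
richness = -np.sum(attn * np.log(attn + 1e-12), axis=1)

# "Correlation" proxy: absolute Pearson correlation between heads.
corr = np.abs(np.corrcoef(attn))
np.fill_diagonal(corr, 0.0)  # no self-loops in the head graph

scores = pagerank(corr, richness)
top_heads = np.argsort(scores)[::-1][:3]  # keep the 3 highest-ranked heads
```

In this sketch, only the parameters of `top_heads` would then be unfrozen for fine-tuning while the rest of the model stays fixed, which is the parameter-efficiency argument the abstract makes.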
Pages: 8521-8537 (17 pages)
Related papers (50 total)
  • [31] OpenDelta: A Plug-and-Play Library for Parameter-efficient Adaptation of Pre-trained Models
    Hu, Shengding
    Ding, Ning
    Zhao, Weilin
    Lv, Xingtai
    Zhang, Zhen
    Liu, Zhiyuan
    Sun, Maosong
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-DEMO 2023, VOL 3, 2023, : 274 - 281
  • [32] Pit One Against Many: Leveraging Attention-head Embeddings for Parameter-efficient Multi-head Attention
    Xue, Huiyin
    Aletras, Nikolaos
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 10355 - 10373
  • [33] Exploring Versatile Generative Language Model Via Parameter-Efficient Transfer Learning
    Lin, Zhaojiang
    Madotto, Andrea
    Fung, Pascale
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 441 - 459
  • [34] Parameter-Efficient Fine-Tuning Large Speech Model Based on LoRA
    Ou, Ling
    Feng, Gen
    PROCEEDINGS OF THE 2024 27TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, CSCWD 2024, 2024, : 36 - 41
  • [35] Parameter-efficient Tuning for Large Language Model without Calculating Its Gradients
    Jin, Feihu
    Zhang, Jiajun
    Zong, Chengqing
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 321 - 330
  • [36] CAUSAL-STORY: LOCAL CAUSAL ATTENTION UTILIZING PARAMETER-EFFICIENT TUNING FOR VISUAL STORY SYNTHESIS
    Song, Tianyi
    Cao, Jiuxin
    Wang, Kun
    Liu, Bo
    Zhang, Xiaofeng
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 3350 - 3354
  • [37] Parameter-efficient adaptation with multi-channel adversarial training for far-field speech recognition
    Niu, Tong
    Chen, Yaqi
    Qu, Dan
    Hu, Hengbo
    Liu, ChengRan
    EURASIP Journal on Audio, Speech, and Music Processing, 2025 (1)
  • [38] PPEA-Depth: Progressive Parameter-Efficient Adaptation for Self-Supervised Monocular Depth Estimation
    Dong, Yue-Jiang
    Guo, Yuan-Chen
    Liu, Ying-Tian
    Zhang, Fang-Lue
    Zhang, Song-Hai
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 2, 2024, : 1609 - 1617
  • [39] Parameter-Efficient Language Model Tuning with Active Learning in Low-Resource Settings
    Jukic, Josip
    Snajder, Jan
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 5061 - 5074
  • [40] Parameter-Efficient Log Anomaly Detection Based on Pre-training Model and LoRA
    He, Shiming
    Lei, Ying
    Zhang, Ying
    Xie, Kun
    Sharma, Pradip Kumar
    2023 IEEE 34TH INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING, ISSRE, 2023, : 207 - 217