HiFi: High-Information Attention Heads Hold for Parameter-Efficient Model Adaptation

Cited by: 0
Authors:
Gui, Anchun [1 ]
Xiao, Han [1 ]
Affiliations:
[1] Xiamen Univ, Sch Informat, Dept Artificial Intelligence, Xiamen, Peoples R China
Funding:
National Natural Science Foundation of China
Keywords: none listed
DOI: not available
Chinese Library Classification: TP18 (Artificial Intelligence Theory)
Discipline codes: 081104; 0812; 0835; 1405
Abstract:
To fully leverage the advantages of large-scale pre-trained language models (PLMs) on downstream tasks, fine-tuning all of a PLM's parameters has become the ubiquitous adaptation paradigm. However, because of the sheer number of parameters in PLMs, this paradigm leads to inefficient updates and excessive resource consumption in data-scarce and resource-limited scenarios. To alleviate these concerns, we propose HiFi, a parameter-efficient fine-tuning method that tunes only the attention heads that are highly informative and strongly correlated for the specific task. To identify these significant heads, we develop a novel framework for analyzing head effectiveness: we first model the relationships between heads as a graph from the two perspectives of information richness and correlation, and then apply the PageRank algorithm to determine the relative importance of each head. Extensive experiments on the GLUE benchmark demonstrate the effectiveness of our method and show that HiFi achieves state-of-the-art performance over prior baselines.
Pages: 8521-8537 (17 pages)
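
The abstract describes a two-step head-selection pipeline: build a graph over attention heads from the two perspectives of information richness and inter-head correlation, then run PageRank over that graph to score each head's importance. The following is a minimal Python sketch of that idea, not the paper's implementation: the function rank_heads, the variance proxy for information richness, the absolute Pearson-correlation proxy for inter-head correlation, and the product used as edge weights are all illustrative assumptions, since the abstract does not give the exact definitions.

# Illustrative sketch only: HiFi's exact metrics are not specified in the
# abstract above; variance and Pearson correlation are assumed proxies.
import numpy as np

def rank_heads(head_outputs, damping=0.85, iters=100, tol=1e-8):
    # head_outputs: (num_heads, num_samples) array; each row is a scalar
    # summary of one attention head's activations on a probe set.
    num_heads = head_outputs.shape[0]

    # Assumed "information richness" proxy: per-head activation variance.
    richness = head_outputs.var(axis=1)

    # Assumed "correlation" proxy: absolute Pearson correlation between heads.
    corr = np.abs(np.corrcoef(head_outputs))

    # Edge weights combine both perspectives; self-loops are removed.
    weights = corr * np.sqrt(np.outer(richness, richness))
    np.fill_diagonal(weights, 0.0)

    # Column-normalize into a stochastic transition matrix.
    col_sums = weights.sum(axis=0, keepdims=True)
    col_sums[col_sums == 0] = 1.0
    transition = weights / col_sums

    # Standard PageRank power iteration with uniform teleportation.
    scores = np.full(num_heads, 1.0 / num_heads)
    teleport = np.full(num_heads, 1.0 / num_heads)
    for _ in range(iters):
        new_scores = damping * (transition @ scores) + (1.0 - damping) * teleport
        if np.abs(new_scores - scores).sum() < tol:
            return new_scores
        scores = new_scores
    return scores

# Usage: rank 12 simulated heads, then fine-tune only the top 4.
rng = np.random.default_rng(0)
scores = rank_heads(rng.normal(size=(12, 256)))
top_k = np.argsort(scores)[::-1][:4]
print("heads selected for fine-tuning:", top_k)

Given such a ranking, only the parameters of the top-ranked heads would be unfrozen during fine-tuning while the rest of the model stays fixed, which is the adaptation scheme the abstract describes.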

Related Papers (50 total; first 10 shown):
  • [1] Parameter-Efficient Model Adaptation for Vision Transformers
    He, Xuehai
    Li, Chuanyuan
    Zhang, Pengchuan
    Yang, Jianwei
    Wang, Xin Eric
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 1, 2023: 817-825
  • [2] PARAMETER-EFFICIENT VISION TRANSFORMER WITH LINEAR ATTENTION
    Zhao, Youpeng
    Tang, Huadong
    Jiang, Yingying
    Yong, A.
    Wu, Qiang
    Wang, Jun
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023: 1275-1279
  • [3] Parameter-Efficient Tuning with Special Token Adaptation
    Yang, Xiaocong
    Huang, James Y.
    Zhou, Wenxuan
    Chen, Muhao
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023: 865-872
  • [4] Attention Prompt Tuning: Parameter-efficient Adaptation of Pre-trained Models for Action Recognition
    Bandara, Wele Gedara Chaminda
    Patel, Vishal M.
    2024 IEEE 18TH INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION, FG 2024, 2024.
  • [5] Refocus the Attention for Parameter-Efficient Thermal Infrared Object Tracking
    Lai, Simiao
    Liu, Chang
    Wang, Dong
    Lu, Huchuan
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024.
  • [6] Parameter-Efficient Learning for Text-to-Speech Accent Adaptation
    Yang, Li-Jen
    Yang, Chao-Han Huck
    Chien, Jen-Tzung
    INTERSPEECH 2023, 2023: 4354-4358
  • [7] Parameter-Efficient Adaptation of Foundation Models for Damaged Building Assessment
    Zhao, Fei
    Zhang, Chengcui
    2024 IEEE 7TH INTERNATIONAL CONFERENCE ON MULTIMEDIA INFORMATION PROCESSING AND RETRIEVAL, MIPR 2024, 2024: 417-422
  • [8] PARAMETER-EFFICIENT HYDROLOGIC INFILTRATION-MODEL
    SMITH, RE
    PARLANGE, JY
    TRANSACTIONS-AMERICAN GEOPHYSICAL UNION, 1978, 59(4): 281
  • [9] Client-Customized Adaptation for Parameter-Efficient Federated Learning
    Kim, Yeachan
    Kim, Junho
    Mok, Wing-Lam
    Park, Jun-Hyung
    Lee, SangKeun
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023: 1159-1172
  • [10] PARAMETER-EFFICIENT HYDROLOGIC INFILTRATION-MODEL
    SMITH, RE
    PARLANGE, JY
    WATER RESOURCES RESEARCH, 1978, 14(3): 533-538