HiFi: High-Information Attention Heads Hold for Parameter-Efficient Model Adaptation

Cited by: 0
Authors:
Gui, Anchun [1 ]
Xiao, Han [1 ]
Affiliations:
[1] Xiamen Univ, Sch Informat, Dept Artificial Intelligence, Xiamen, Peoples R China
Funding:
National Natural Science Foundation of China
Keywords: none listed
DOI: not available
Chinese Library Classification: TP18 (Artificial Intelligence Theory)
Discipline codes: 081104; 0812; 0835; 1405
Abstract:
To fully leverage the advantages of large-scale pre-trained language models (PLMs) on downstream tasks, fine-tuning all of a PLM's parameters has become the ubiquitous adaptation paradigm. However, because of the sheer number of parameters in PLMs, this paradigm leads to inefficient updates and excessive resource consumption in data-scarce and resource-limited scenarios. To alleviate these concerns, we propose HiFi, a parameter-efficient fine-tuning method that tunes only the attention heads that are highly informative and strongly correlated for the specific task. To identify these significant heads, we develop a novel framework for analyzing head effectiveness: we first model the relationships between heads as a graph from the two perspectives of information richness and correlation, and then apply the PageRank algorithm to determine the relative importance of each head. Extensive experiments on the GLUE benchmark demonstrate the effectiveness of our method and show that HiFi achieves state-of-the-art performance over prior baselines.
Pages: 8521-8537 (17 pages)
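
The abstract describes a two-step head-selection pipeline: build a graph over attention heads from the two perspectives of information richness and inter-head correlation, then run PageRank over that graph to score each head's importance. The following is a minimal Python sketch of that idea, not the paper's implementation: the function rank_heads, the variance proxy for information richness, the absolute Pearson-correlation proxy for inter-head correlation, and the product used as edge weights are all illustrative assumptions, since the abstract does not give the exact definitions.

# Illustrative sketch only: HiFi's exact metrics are not specified in the
# abstract above; variance and Pearson correlation are assumed proxies.
import numpy as np

def rank_heads(head_outputs, damping=0.85, iters=100, tol=1e-8):
    # head_outputs: (num_heads, num_samples) array; each row is a scalar
    # summary of one attention head's activations on a probe set.
    num_heads = head_outputs.shape[0]

    # Assumed "information richness" proxy: per-head activation variance.
    richness = head_outputs.var(axis=1)

    # Assumed "correlation" proxy: absolute Pearson correlation between heads.
    corr = np.abs(np.corrcoef(head_outputs))

    # Edge weights combine both perspectives; self-loops are removed.
    weights = corr * np.sqrt(np.outer(richness, richness))
    np.fill_diagonal(weights, 0.0)

    # Column-normalize into a stochastic transition matrix.
    col_sums = weights.sum(axis=0, keepdims=True)
    col_sums[col_sums == 0] = 1.0
    transition = weights / col_sums

    # Standard PageRank power iteration with uniform teleportation.
    scores = np.full(num_heads, 1.0 / num_heads)
    teleport = np.full(num_heads, 1.0 / num_heads)
    for _ in range(iters):
        new_scores = damping * (transition @ scores) + (1.0 - damping) * teleport
        if np.abs(new_scores - scores).sum() < tol:
            return new_scores
        scores = new_scores
    return scores

# Usage: rank 12 simulated heads, then fine-tune only the top 4.
rng = np.random.default_rng(0)
scores = rank_heads(rng.normal(size=(12, 256)))
top_k = np.argsort(scores)[::-1][:4]
print("heads selected for fine-tuning:", top_k)

Given such a ranking, only the parameters of the top-ranked heads would be unfrozen during fine-tuning while the rest of the model stays fixed, which is the adaptation scheme the abstract describes.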

Related Papers (50 total; first 10 shown):
  • [1] Parameter-Efficient Model Adaptation for Vision Transformers
    He, Xuehai
    Li, Chuanyuan
    Zhang, Pengchuan
    Yang, Jianwei
    Wang, Xin Eric
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 1, 2023: 817-825
  • [2] PARAMETER-EFFICIENT VISION TRANSFORMER WITH LINEAR ATTENTION
    Zhao, Youpeng
    Tang, Huadong
    Jiang, Yingying
    Yong, A.
    Wu, Qiang
    Wang, Jun
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023: 1275-1279
  • [3] Parameter-Efficient Tuning with Special Token Adaptation
    Yang, Xiaocong
    Huang, James Y.
    Zhou, Wenxuan
    Chen, Muhao
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023: 865-872
  • [4] Attention Prompt Tuning: Parameter-efficient Adaptation of Pre-trained Models for Action Recognition
    Bandara, Wele Gedara Chaminda
    Patel, Vishal M.
    2024 IEEE 18TH INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION, FG 2024, 2024.
  • [5] Refocus the Attention for Parameter-Efficient Thermal Infrared Object Tracking
    Lai, Simiao
    Liu, Chang
    Wang, Dong
    Lu, Huchuan
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024.
  • [6] Parameter-Efficient Learning for Text-to-Speech Accent Adaptation
    Yang, Li-Jen
    Yang, Chao-Han Huck
    Chien, Jen-Tzung
    INTERSPEECH 2023, 2023: 4354-4358
  • [7] Parameter-Efficient Adaptation of Foundation Models for Damaged Building Assessment
    Zhao, Fei
    Zhang, Chengcui
    2024 IEEE 7TH INTERNATIONAL CONFERENCE ON MULTIMEDIA INFORMATION PROCESSING AND RETRIEVAL, MIPR 2024, 2024: 417-422
  • [8] PARAMETER-EFFICIENT HYDROLOGIC INFILTRATION-MODEL
    SMITH, RE
    PARLANGE, JY
    TRANSACTIONS-AMERICAN GEOPHYSICAL UNION, 1978, 59(4): 281
  • [9] Client-Customized Adaptation for Parameter-Efficient Federated Learning
    Kim, Yeachan
    Kim, Junho
    Mok, Wing-Lam
    Park, Jun-Hyung
    Lee, SangKeun
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023: 1159-1172
  • [10] PARAMETER-EFFICIENT HYDROLOGIC INFILTRATION-MODEL
    SMITH, RE
    PARLANGE, JY
    WATER RESOURCES RESEARCH, 1978, 14(3): 533-538