Towards Efficient Fine-Tuning of Pre-trained Code Models: An Experimental Study and Beyond

Cited by: 2
Authors
Shi, Ensheng [1 ,5 ]
Wang, Yanlin [2 ,5 ]
Zhang, Hongyu [3 ]
Du, Lun [4 ]
Han, Shi [4 ]
Zhang, Dongmei [4 ]
Sun, Hongbin [1 ]
Affiliations
[1] Xi'an Jiaotong University, Xi'an, China
[2] Sun Yat-sen University, Zhuhai, China
[3] Chongqing University, Chongqing, China
[4] Microsoft, Beijing, China
[5] Microsoft Research Asia, Beijing, China
Funding
National Key R&D Program of China
Keywords
Empirical Study; Pre-trained Language Models; Efficient Fine-tuning; Probing Techniques; Representational Similarity Analysis
DOI
10.1145/3597926.3598036
CLC Number
TP31 [Computer Software]
Discipline Codes
081202; 0835
Abstract
Recently, fine-tuning pre-trained code models such as CodeBERT on downstream tasks has achieved great success in many software testing and analysis tasks. While effective and prevalent, fine-tuning all the pre-trained parameters incurs a large computational cost. In this paper, we conduct an extensive experimental study to explore what happens to layer-wise pre-trained representations and their encoded code knowledge during fine-tuning, and we then propose efficient alternatives for fine-tuning large pre-trained code models based on the findings. Our experimental study shows that (1) lexical, syntactic, and structural properties of source code are encoded in the lower, intermediate, and higher layers, respectively, while the semantic property spans the entire model. (2) Fine-tuning preserves most of the code properties; in particular, the basic code properties captured by the lower and intermediate layers remain intact, and only the representations of the top two layers change substantially across various downstream tasks. (3) Based on these findings, we propose Telly, which efficiently fine-tunes pre-trained code models via layer freezing. Extensive experimental results on five diverse downstream tasks demonstrate that the number of training parameters and the corresponding time cost are greatly reduced, while performance remains comparable or better.
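The layer-freezing idea behind Telly can be illustrated with a minimal sketch. The code below assumes a RoBERTa-style CodeBERT encoder loaded through the Hugging Face Transformers library; the model name and the choice to freeze everything below the top two encoder layers are illustrative assumptions, not the paper's exact configuration.

# Minimal sketch of efficient fine-tuning via layer freezing (assumptions noted above).
from transformers import AutoModel

model = AutoModel.from_pretrained("microsoft/codebert-base")  # RoBERTa-style encoder

FREEZE_BELOW = 10  # illustrative: freeze embeddings and encoder layers 0-9, tune the top two

# Freeze the embedding layer.
for param in model.embeddings.parameters():
    param.requires_grad = False

# Freeze all encoder layers below the threshold; the remaining top layers stay trainable.
for idx, layer in enumerate(model.encoder.layer):
    if idx < FREEZE_BELOW:
        for param in layer.parameters():
            param.requires_grad = False

# Report the reduction in trainable parameters.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable parameters: {trainable} / {total}")

Only the unfrozen parameters are then updated by the optimizer during downstream training, which is where the reported savings in trainable parameters and training time come from.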
Pages: 39-51
Number of pages: 13
Related Papers
50 items in total
  • [21] Fine-Tuning BERT-Based Pre-Trained Models for Arabic Dependency Parsing
    Al-Ghamdi, Sharefah
    Al-Khalifa, Hend
    Al-Salman, Abdulmalik
    APPLIED SCIENCES-BASEL, 2023, 13 (07):
  • [22] Fine-tuning Pre-Trained Transformer Language Models to Distantly Supervised Relation Extraction
    Alt, Christoph
    Huebner, Marc
    Hennig, Leonhard
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 1388 - 1398
  • [23] Virtual Data Augmentation: A Robust and General Framework for Fine-tuning Pre-trained Models
    Zhou, Kun
    Zhao, Wayne Xin
    Wang, Sirui
    Zhang, Fuzheng
    Wu, Wei
    Wen, Ji-Rong
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 3875 - 3887
  • [24] Point-PEFT: Parameter-Efficient Fine-Tuning for 3D Pre-trained Models
    Tang, Yiwen
    Zhang, Ray
    Guo, Zoey
    Ma, Xianzheng
    Zhao, Bin
    Wang, Zhigang
    Wang, Dong
    Li, Xuelong
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 6, 2024, : 5171 - 5179
  • [25] Disfluencies and Fine-Tuning Pre-trained Language Models for Detection of Alzheimer's Disease
    Yuan, Jiahong
    Bian, Yuchen
    Cai, Xingyu
    Huang, Jiaji
    Ye, Zheng
    Church, Kenneth
    INTERSPEECH 2020, 2020, : 2162 - 2166
  • [26] Confounder balancing in adversarial domain adaptation for pre-trained large models fine-tuning
    Jiang, Shuoran
    Chen, Qingcai
    Xiang, Yang
    Pan, Youcheng
    Wu, Xiangping
    Lin, Yukang
    NEURAL NETWORKS, 2024, 173
  • [27] SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization
    Jiang, Haoming
    He, Pengcheng
    Chen, Weizhu
    Liu, Xiaodong
    Gao, Jianfeng
    Zhao, Tuo
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 2177 - 2190
  • [28] Empirical study on fine-tuning pre-trained large language models for fault diagnosis of complex systems
    Zheng, Shuwen
    Pan, Kai
    Liu, Jie
    Chen, Yunxia
    RELIABILITY ENGINEERING & SYSTEM SAFETY, 2024, 252
  • [29] Variational Monte Carlo on a Budget - Fine-tuning pre-trained Neural Wavefunctions
    Scherbela, Michael
    Gerard, Leon
    Grohs, Philipp
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [30] Bridging pre-trained models to continual learning: A hypernetwork based framework with parameter-efficient fine-tuning techniques
    Ding, Fengqian
    Xu, Chen
    Liu, Han
    Zhou, Bin
    Zhou, Hongchao
    INFORMATION SCIENCES, 2024, 674