Towards Efficient Fine-Tuning of Pre-trained Code Models: An Experimental Study and Beyond

Cited by: 2
Authors
Shi, Ensheng [1,5]
Wang, Yanlin [2,5]
Zhang, Hongyu [3 ]
Du, Lun [4 ]
Han, Shi [4 ]
Zhang, Dongmei [4 ]
Sun, Hongbin [1 ]
Affiliations
[1] Xi An Jiao Tong Univ, Xian, Peoples R China
[2] Sun Yat Sen Univ, Zhuhai, Peoples R China
[3] Chongqing Univ, Chongqing, Peoples R China
[4] Microsoft, Beijing, Peoples R China
[5] Microsoft Res Asia, Beijing, Peoples R China
Funding
National Key R&D Program of China;
Keywords
Empirical study; Pre-Trained Language Models; Efficient Fine-tuning; Probing Techniques; Representational Similarity Analysis;
DOI
10.1145/3597926.3598036
CLC Classification Number
TP31 [Computer Software];
Subject Classification Codes
081202; 0835;
Abstract
Recently, fine-tuning pre-trained code models such as CodeBERT on downstream tasks has achieved great success in many software testing and analysis tasks. While effective and prevalent, fine-tuning the pre-trained parameters incurs a large computational cost. In this paper, we conduct an extensive experimental study to explore what happens to layer-wise pre-trained representations and their encoded code knowledge during fine-tuning, and we then propose efficient alternatives to fine-tuning large pre-trained code models based on the findings. Our experimental study shows that (1) lexical, syntactic, and structural properties of source code are encoded in the lower, intermediate, and higher layers, respectively, while the semantic property spans the entire model. (2) Fine-tuning preserves most of the code properties: the basic properties captured by the lower and intermediate layers remain intact, and only the representations of the top two layers change substantially across various downstream tasks. (3) Based on these findings, we propose Telly, which efficiently fine-tunes pre-trained code models via layer freezing. Extensive experimental results on five diverse downstream tasks demonstrate that the number of trainable parameters and the corresponding training time are greatly reduced, while performance remains similar or better.
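The layer-freezing idea summarized in the abstract can be illustrated with a short sketch. The snippet below is a minimal, hypothetical Python example, not the authors' Telly implementation: it loads CodeBERT through the Hugging Face transformers library and freezes the embeddings plus the bottom k encoder layers, so that only the upper layers and the task head receive gradient updates. The freeze_bottom_layers helper and the choice of k = 10 are illustrative assumptions.

```python
# A minimal sketch of fine-tuning CodeBERT with the lower layers frozen,
# in the spirit of the layer-freezing strategy described in the abstract.
# Assumptions: PyTorch and Hugging Face `transformers` are installed;
# `freeze_bottom_layers` and k = 10 are illustrative choices, not the
# authors' Telly configuration.
import torch
from transformers import AutoModelForSequenceClassification

MODEL_NAME = "microsoft/codebert-base"  # a 12-layer RoBERTa-style encoder


def freeze_bottom_layers(model, k: int) -> None:
    """Freeze the embeddings and the lowest k encoder layers."""
    for param in model.roberta.embeddings.parameters():
        param.requires_grad = False
    for layer in model.roberta.encoder.layer[:k]:
        for param in layer.parameters():
            param.requires_grad = False


# Binary classification head on top of the pre-trained encoder.
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

# Freeze the lower 10 of 12 layers so that only the top two encoder layers
# and the classification head are updated during fine-tuning.
freeze_bottom_layers(model, k=10)

# Hand only the trainable parameters to the optimizer; this is where the
# savings in trainable parameters and training time come from.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=2e-5)

total = sum(p.numel() for p in model.parameters())
updated = sum(p.numel() for p in trainable)
print(f"Updating {updated:,} of {total:,} parameters ({100.0 * updated / total:.1f}%)")
```

With k = 10 on the 12-layer encoder, only the top two layers and the classifier are trained, mirroring the study's finding that the top layers change most during fine-tuning; in practice, the number of frozen layers would be tuned per task.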
Pages: 39 - 51
Number of pages: 13