Towards Efficient Fine-Tuning of Pre-trained Code Models: An Experimental Study and Beyond

Cited by: 2
Authors
Shi, Ensheng [1,5]
Wang, Yanlin [2,5]
Zhang, Hongyu [3 ]
Du, Lun [4 ]
Han, Shi [4 ]
Zhang, Dongmei [4 ]
Sun, Hongbin [1 ]
Affiliations
[1] Xi An Jiao Tong Univ, Xian, Peoples R China
[2] Sun Yat Sen Univ, Zhuhai, Peoples R China
[3] Chongqing Univ, Chongqing, Peoples R China
[4] Microsoft, Beijing, Peoples R China
[5] Microsoft Res Asia, Beijing, Peoples R China
Funding
National Key R&D Program of China;
Keywords
Empirical study; Pre-Trained Language Models; Efficient Fine-tuning; Probing Techniques; Representational Similarity Analysis;
DOI
10.1145/3597926.3598036
CLC Classification Number
TP31 [Computer Software];
Subject Classification Codes
081202; 0835;
Abstract
Recently, fine-tuning pre-trained code models such as CodeBERT on downstream tasks has achieved great success in many software testing and analysis tasks. While effective and prevalent, fine-tuning the pre-trained parameters incurs a large computational cost. In this paper, we conduct an extensive experimental study to explore what happens to layer-wise pre-trained representations and their encoded code knowledge during fine-tuning, and we then propose efficient alternatives to fine-tuning large pre-trained code models based on the findings. Our experimental study shows that (1) lexical, syntactic, and structural properties of source code are encoded in the lower, intermediate, and higher layers, respectively, while the semantic property spans the entire model. (2) Fine-tuning preserves most of the code properties: the basic properties captured by the lower and intermediate layers remain intact, and only the representations of the top two layers change substantially across various downstream tasks. (3) Based on these findings, we propose Telly, which efficiently fine-tunes pre-trained code models via layer freezing. Extensive experimental results on five diverse downstream tasks demonstrate that the number of trainable parameters and the corresponding training time are greatly reduced, while performance remains similar or better.
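The layer-freezing idea summarized in the abstract can be illustrated with a short sketch. The snippet below is a minimal, hypothetical Python example, not the authors' Telly implementation: it loads CodeBERT through the Hugging Face transformers library and freezes the embeddings plus the bottom k encoder layers, so that only the upper layers and the task head receive gradient updates. The freeze_bottom_layers helper and the choice of k = 10 are illustrative assumptions.

```python
# A minimal sketch of fine-tuning CodeBERT with the lower layers frozen,
# in the spirit of the layer-freezing strategy described in the abstract.
# Assumptions: PyTorch and Hugging Face `transformers` are installed;
# `freeze_bottom_layers` and k = 10 are illustrative choices, not the
# authors' Telly configuration.
import torch
from transformers import AutoModelForSequenceClassification

MODEL_NAME = "microsoft/codebert-base"  # a 12-layer RoBERTa-style encoder


def freeze_bottom_layers(model, k: int) -> None:
    """Freeze the embeddings and the lowest k encoder layers."""
    for param in model.roberta.embeddings.parameters():
        param.requires_grad = False
    for layer in model.roberta.encoder.layer[:k]:
        for param in layer.parameters():
            param.requires_grad = False


# Binary classification head on top of the pre-trained encoder.
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

# Freeze the lower 10 of 12 layers so that only the top two encoder layers
# and the classification head are updated during fine-tuning.
freeze_bottom_layers(model, k=10)

# Hand only the trainable parameters to the optimizer; this is where the
# savings in trainable parameters and training time come from.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=2e-5)

total = sum(p.numel() for p in model.parameters())
updated = sum(p.numel() for p in trainable)
print(f"Updating {updated:,} of {total:,} parameters ({100.0 * updated / total:.1f}%)")
```

With k = 10 on the 12-layer encoder, only the top two layers and the classifier are trained, mirroring the study's finding that the top layers change most during fine-tuning; in practice, the number of frozen layers would be tuned per task.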
Pages: 39 - 51
Number of pages: 13