Towards Efficient Fine-Tuning of Pre-trained Code Models: An Experimental Study and Beyond

Cited by: 2
Authors
Shi, Ensheng [1 ,5 ]
Wang, Yanlin [2 ,5 ]
Zhang, Hongyu [3 ]
Du, Lun [4 ]
Han, Shi [4 ]
Zhang, Dongmei [4 ]
Sun, Hongbin [1 ]
Affiliations
[1] Xi'an Jiaotong University, Xi'an, China
[2] Sun Yat-sen University, Zhuhai, China
[3] Chongqing University, Chongqing, China
[4] Microsoft, Beijing, China
[5] Microsoft Research Asia, Beijing, China
Funding
National Key R&D Program of China
Keywords
Empirical Study; Pre-trained Language Models; Efficient Fine-tuning; Probing Techniques; Representational Similarity Analysis
DOI
10.1145/3597926.3598036
CLC Number
TP31 [Computer Software]
Discipline Codes
081202; 0835
Abstract
Recently, fine-tuning pre-trained code models such as CodeBERT on downstream tasks has achieved great success in many software testing and analysis tasks. While effective and prevalent, fine-tuning all the pre-trained parameters incurs a large computational cost. In this paper, we conduct an extensive experimental study to explore what happens to layer-wise pre-trained representations and their encoded code knowledge during fine-tuning, and we then propose efficient alternatives for fine-tuning large pre-trained code models based on the findings. Our experimental study shows that (1) lexical, syntactic, and structural properties of source code are encoded in the lower, intermediate, and higher layers, respectively, while the semantic property spans the entire model. (2) Fine-tuning preserves most of the code properties; in particular, the basic code properties captured by the lower and intermediate layers remain intact, and only the representations of the top two layers change substantially across various downstream tasks. (3) Based on these findings, we propose Telly, which efficiently fine-tunes pre-trained code models via layer freezing. Extensive experimental results on five diverse downstream tasks demonstrate that the number of training parameters and the corresponding time cost are greatly reduced, while performance remains comparable or better.
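The layer-freezing idea behind Telly can be illustrated with a minimal sketch. The code below assumes a RoBERTa-style CodeBERT encoder loaded through the Hugging Face Transformers library; the model name and the choice to freeze everything below the top two encoder layers are illustrative assumptions, not the paper's exact configuration.

# Minimal sketch of efficient fine-tuning via layer freezing (assumptions noted above).
from transformers import AutoModel

model = AutoModel.from_pretrained("microsoft/codebert-base")  # RoBERTa-style encoder

FREEZE_BELOW = 10  # illustrative: freeze embeddings and encoder layers 0-9, tune the top two

# Freeze the embedding layer.
for param in model.embeddings.parameters():
    param.requires_grad = False

# Freeze all encoder layers below the threshold; the remaining top layers stay trainable.
for idx, layer in enumerate(model.encoder.layer):
    if idx < FREEZE_BELOW:
        for param in layer.parameters():
            param.requires_grad = False

# Report the reduction in trainable parameters.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable parameters: {trainable} / {total}")

Only the unfrozen parameters are then updated by the optimizer during downstream training, which is where the reported savings in trainable parameters and training time come from.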
Pages: 39-51
Number of pages: 13
Related Papers
50 items in total
  • [21] Fine-Tuning BERT-Based Pre-Trained Models for Arabic Dependency Parsing
    Al-Ghamdi, Sharefah
    Al-Khalifa, Hend
    Al-Salman, Abdulmalik
    APPLIED SCIENCES-BASEL, 2023, 13 (07):
  • [22] Fine-tuning Pre-Trained Transformer Language Models to Distantly Supervised Relation Extraction
    Alt, Christoph
    Huebner, Marc
    Hennig, Leonhard
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 1388 - 1398
  • [23] Virtual Data Augmentation: A Robust and General Framework for Fine-tuning Pre-trained Models
    Zhou, Kun
    Zhao, Wayne Xin
    Wang, Sirui
    Zhang, Fuzheng
    Wu, Wei
    Wen, Ji-Rong
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 3875 - 3887
  • [24] Point-PEFT: Parameter-Efficient Fine-Tuning for 3D Pre-trained Models
    Tang, Yiwen
    Zhang, Ray
    Guo, Zoey
    Ma, Xianzheng
    Zhao, Bin
    Wang, Zhigang
    Wang, Dong
    Li, Xuelong
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 6, 2024, : 5171 - 5179
  • [25] Disfluencies and Fine-Tuning Pre-trained Language Models for Detection of Alzheimer's Disease
    Yuan, Jiahong
    Bian, Yuchen
    Cai, Xingyu
    Huang, Jiaji
    Ye, Zheng
    Church, Kenneth
    INTERSPEECH 2020, 2020, : 2162 - 2166
  • [26] Confounder balancing in adversarial domain adaptation for pre-trained large models fine-tuning
    Jiang, Shuoran
    Chen, Qingcai
    Xiang, Yang
    Pan, Youcheng
    Wu, Xiangping
    Lin, Yukang
    NEURAL NETWORKS, 2024, 173
  • [27] SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization
    Jiang, Haoming
    He, Pengcheng
    Chen, Weizhu
    Liu, Xiaodong
    Gao, Jianfeng
    Zhao, Tuo
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 2177 - 2190
  • [28] Empirical study on fine-tuning pre-trained large language models for fault diagnosis of complex systems
    Zheng, Shuwen
    Pan, Kai
    Liu, Jie
    Chen, Yunxia
    RELIABILITY ENGINEERING & SYSTEM SAFETY, 2024, 252
  • [29] Variational Monte Carlo on a Budget - Fine-tuning pre-trained Neural Wavefunctions
    Scherbela, Michael
    Gerard, Leon
    Grohs, Philipp
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [30] Bridging pre-trained models to continual learning: A hypernetwork based framework with parameter-efficient fine-tuning techniques
    Ding, Fengqian
    Xu, Chen
    Liu, Han
    Zhou, Bin
    Zhou, Hongchao
    INFORMATION SCIENCES, 2024, 674