Impact of Code Language Models on Automated Program Repair

Cited by: 36
Authors
Jiang, Nan [1]
Liu, Kevin [2]
Lutellier, Thibaud [3]
Tan, Lin [1]
Affiliations
[1] Purdue Univ, W Lafayette, IN 47907 USA
[2] Lynbrook High Sch, San Jose, CA USA
[3] Univ Alberta, Edmonton, AB, Canada
Keywords
Automated Program Repair; Code Language Model; Fine-Tuning; Deep Learning
DOI
10.1109/ICSE48619.2023.00125
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Automated program repair (APR) aims to help developers improve software reliability by generating patches for buggy programs. Although many code language models (CLMs) have been developed and are effective in many software tasks such as code completion, there has been little comprehensive, in-depth work to evaluate CLMs' fixing capabilities and to fine-tune CLMs for the APR task. First, this work is the first to evaluate ten CLMs on four APR benchmarks, which shows that, surprisingly, the best CLM, as is, fixes 72% more bugs than the state-of-the-art deep-learning (DL)-based APR techniques. Second, one of the four APR benchmarks was created by us in this paper to avoid data leakage for a fair evaluation. Third, it is the first work to fine-tune CLMs with APR training data, which shows that fine-tuning brings a 31%-1,267% improvement to CLMs and enables them to fix 46%-164% more bugs than existing DL-based APR techniques. Fourth, this work studies the impact of buggy lines, showing that CLMs, as is, cannot make good use of the buggy lines to fix bugs, yet fine-tuned CLMs could potentially over-rely on buggy lines. Lastly, this work analyzes the size, time, and memory efficiency of different CLMs. This work shows promising directions for the APR domain, such as fine-tuning CLMs with APR-specific designs; it also raises awareness of fair and comprehensive evaluations of CLMs and calls for more transparent reporting of the open-source repositories used in pre-training data to address the data leakage problem.
Pages: 1430-1442
Number of pages: 13
Related papers
50 in total
  • [21] Automated Program Repair by Using Similar Code Containing Fix Ingredients
    Ji, Tao
    Chen, Liqian
    Mao, Xiaoguang
    Yi, Xin
    PROCEEDINGS 2016 IEEE 40TH ANNUAL COMPUTER SOFTWARE AND APPLICATIONS CONFERENCE WORKSHOPS, VOL 1, 2016, : 197 - 202
  • [22] DLFix: Context-based Code Transformation Automated Program Repair
    Li, Yi
    Wang, Shaohua
    Nguyen, Tien N.
    2020 ACM/IEEE 42ND INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2020), 2020, : 602 - 614
  • [23] A Novel Approach for Automated Program Repair using Round-Trip Translation with Large Language Models
    Ruiz, Fernando Vallecillos
    Grishina, Anastasiia
    Hort, Max
    Moonen, Leon
    arXiv,
  • [24] Large Language Models in Automated Repair of Haskell Type Errors
    Santos, Sofia
    Saraiva, Joao
    Ribeiro, Francisco
    2024 ACM/IEEE INTERNATIONAL WORKSHOP ON AUTOMATED PROGRAM REPAIR, APR 2024, 2024, : 42 - 45
  • [25] Automated C/C++ Program Repair for High-Level Synthesis via Large Language Models
    Xu, Kangwei
    Zhang, Grace Li
    Yin, Xunzhao
    Zhuo, Chang
    Schlichtmann, Ulf
    Li, Bing
    2024 ACM/IEEE 6TH SYMPOSIUM ON MACHINE LEARNING FOR CAD, MLCAD 2024, 2024,
  • [26] Exploring and Unleashing the Power of Large Language Models in Automated Code Translation
    Yang, Zhen
    Liu, Fang
    Yu, Zhongxing
    Keung, Jacky Wai
    Li, Jia
    Liu, Shuo
    Hong, Yifan
    Ma, Xiaoxue
    Jin, Zhi
    Li, Ge
    arXiv,
  • [27] A Novel Fitness Function for Automated Program Repair Based on Source Code Checkpoints
    de Souza, Eduardo Faria
    Le Goues, Claire
    Camilo, Celso Goncalves
    GECCO'18: PROCEEDINGS OF THE 2018 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, 2018, : 1443 - 1450
  • [28] Harnessing the Power of Large Language Models for Automated Code Generation and Verification
    Antero, Unai
    Blanco, Francisco
    Onativia, Jon
    Salle, Damien
    Sierra, Basilio
    ROBOTICS, 2024, 13 (09)
  • [29] Benchmarking Large Language Models for Automated Verilog RTL Code Generation
    Thakur, Shailja
    Ahmad, Baleegh
    Fan, Zhenxing
    Pearce, Hammond
    Tan, Benjamin
    Karri, Ramesh
    Dolan-Gavitt, Brendan
    Garg, Siddharth
    2023 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION, DATE, 2023,
  • [30] TransplantFix: Graph Differencing-based Code Transplantation for Automated Program Repair
    Yang, Deheng
    Mao, Xiaoguang
    Chen, Liqian
    Xu, Xuezheng
    Lei, Yan
    Lo, David
    He, Jiayu
    PROCEEDINGS OF THE 37TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING, ASE 2022, 2022,