Fine-tuning Language Models for Joint Rewriting and Completion of Code with Potential Bugs

Cited by: 0

Authors:

Wang, Dingmin [1]

Zhao, Jinman [2]

Pei, Hengzhi [2]

Tan, Samson [3]

Zha, Sheng [3]

Affiliations:

[1] Univ Oxford, Oxford, England

[2] Amazon Web Serv, Seattle, WA USA

[3] Amazon AGI, Seattle, WA USA

Source:

FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024 | 2024

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Handling drafty partial code remains a notable challenge in real-time code suggestion applications. Previous work has demonstrated shortcomings of large language models of code (CodeLLMs) in completing partial code with potential bugs. In this study, we view partial code as implementation hints and fine-tune CodeLLMs to jointly rewrite and complete partial code into functional full programs. We explore two strategies: one-pass generation and multi-pass iterative refinement. We construct new training and testing datasets using semantic-altering code transformations and iterative self-generations. We conduct comprehensive experiments over three representative open-source CodeLLMs: InCoder, CodeGen, and StarCoder. Results show that CodeLLMs fine-tuned using our approach achieve superior pass rates compared to the previous baselines across existing and newly created benchmarks, effectively handle both potentially buggy and clean code, and largely preserve the integrity of the original partial implementations. We further present findings on the properties of the potential bugs we tested and on the design choices of our methods.


Pages: 15854-15868

Page count: 15
