Test-Driven Multi-Task Learning with Functionally Equivalent Code Transformation for Neural Code Generation

Cited by: 0

Authors:

Wang, Xin [1]

Liu, Xiao [2]

Zhou, Pingyi [3]

Liu, Qixia [4]

Liu, Jin [1]

Wu, Hao [5]

Cui, Xiaohui [6]

Affiliations:

[1] Wuhan Univ, Sch Comp Sci, Wuhan, Peoples R China

[2] Deakin Univ, Sch Informat Technol, Geelong, Vic, Australia

[3] Huawei Technol, Noahs Ark Lab, Shenzhen, Peoples R China

[4] China Mobile Commun Corp, Suzhou, Peoples R China

[5] Yunnan Univ, Sch Informat Sci & Engn, Kunming, Peoples R China

[6] Wuhan Univ, Sch Cyber Sci & Engn, Wuhan, Peoples R China

Source:

PROCEEDINGS OF THE 37TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING, ASE 2022 | 2022

Funding:

National Natural Science Foundation of China;

Keywords:

Neural Code Generation; Program Analysis; Execution Feedback; Code Transformation; Multi-Task Learning;

DOI:

10.1145/3551349.3559549

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Automated code generation is a longstanding challenge in both the software engineering and artificial intelligence communities. Recently, some works have started to investigate the functional correctness of code generation, where a code snippet is considered correct if it passes a set of test cases. However, most existing works still model code generation as text generation without considering program-specific information, such as functionally equivalent code snippets and test execution feedback. To address these limitations, this paper proposes a method combining program analysis with deep learning for neural code generation, in which functionally equivalent code snippets and test execution feedback are considered at the training stage. Concretely, we first design several code transformation heuristics to produce different variants of a code snippet that satisfy the same functionality. In addition, we employ test execution feedback and design a test-driven discriminative task to train a novel discriminator, so that the model can distinguish whether the generated code is correct. Preliminary results on a newly published dataset demonstrate the effectiveness of the proposed framework for code generation. In particular, in terms of the pass@1 metric, we achieve gains of 8.81 and 11.53 over CodeGPT and CodeT5, respectively.
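The two ideas named in the abstract, functionally equivalent variants and test-based correctness, can be illustrated with a minimal Python sketch. It is not the paper's implementation: the variable-renaming transform, the helper names (`rename_variable`, `passes_tests`, `pass_at_1`), and the toy `add` function are all assumptions chosen for illustration, and `pass_at_1` here is simply the fraction of single samples that pass every test case. The sketch needs Python 3.9 or later for `ast.unparse`.

```python
import ast


class RenameVariable(ast.NodeTransformer):
    """Rename one identifier (hypothetical stand-in for a transformation heuristic)."""

    def __init__(self, old, new):
        self.old, self.new = old, new

    def visit_Name(self, node):
        # Covers both reads and writes of the variable.
        if node.id == self.old:
            node.id = self.new
        return node

    def visit_arg(self, node):
        # Covers function parameters with the same name.
        if node.arg == self.old:
            node.arg = self.new
        return node


def rename_variable(source, old, new):
    """Return a variant of `source` with `old` renamed to `new`."""
    tree = RenameVariable(old, new).visit(ast.parse(source))
    return ast.unparse(tree)


def passes_tests(source, func_name, tests):
    """Run `source` and report whether `func_name` passes every (args, expected) pair."""
    namespace = {}
    try:
        exec(source, namespace)  # illustration only: never exec untrusted code unsandboxed
        return all(namespace[func_name](*args) == expected for args, expected in tests)
    except Exception:
        return False


def pass_at_1(outcomes):
    """Fraction of single samples that passed all of their tests."""
    return sum(outcomes) / len(outcomes)


ORIGINAL = "def add(a, b):\n    total = a + b\n    return total\n"
VARIANT = rename_variable(ORIGINAL, "total", "result")
BUGGY = "def add(a, b):\n    return a - b\n"
TESTS = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]

# Test outcomes could serve as correct/incorrect labels for a discriminative task.
OUTCOMES = [passes_tests(src, "add", TESTS) for src in (ORIGINAL, VARIANT, BUGGY)]
```

In this toy setup the original and its renamed variant pass the same tests while the buggy sample fails, so the test outcome distinguishes correct from incorrect code regardless of surface form.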


Pages: 6
