Test-Driven Multi-Task Learning with Functionally Equivalent Code Transformation for Neural Code Generation

Cited by: 0
Authors
Wang, Xin [1 ]
Liu, Xiao [2 ]
Zhou, Pingyi [3 ]
Liu, Qixia [4 ]
Liu, Jin [1 ]
Wu, Hao [5 ]
Cui, Xiaohui [6 ]
Affiliations
[1] Wuhan University, School of Computer Science, Wuhan, China
[2] Deakin University, School of Information Technology, Geelong, VIC, Australia
[3] Huawei Technologies, Noah's Ark Lab, Shenzhen, China
[4] China Mobile Communications Corporation, Suzhou, China
[5] Yunnan University, School of Information Science and Engineering, Kunming, China
[6] Wuhan University, School of Cyber Science and Engineering, Wuhan, China
Funding
National Natural Science Foundation of China
Keywords
Neural Code Generation; Program Analysis; Execution Feedback; Code Transformation; Multi-Task Learning;
DOI
10.1145/3551349.3559549
CLC Classification
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Automated code generation is a longstanding challenge in both the software engineering and artificial intelligence communities. Recently, some works have begun to investigate the functional correctness of code generation, where a code snippet is considered correct if it passes a set of test cases. However, most existing works still model code generation as pure text generation, without considering program-specific information such as functionally equivalent code snippets and test execution feedback. To address these limitations, this paper proposes a method that combines program analysis with deep learning for neural code generation, incorporating functionally equivalent code snippets and test execution feedback at the training stage. Concretely, we first design several code transformation heuristics to produce variants of a code snippet that satisfy the same functionality. In addition, we exploit test execution feedback and design a test-driven discriminative task to train a novel discriminator, enabling the model to distinguish whether the generated code is correct. Preliminary results on a newly published dataset demonstrate the effectiveness of the proposed framework for code generation. In particular, in terms of the pass@1 metric, we achieve gains of 8.81 and 11.53 points over CodeGPT and CodeT5, respectively.
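
To make the abstract's program-specific signals concrete, the following is a minimal Python sketch, not the authors' implementation: rename_variables illustrates one possible equivalence-preserving transformation heuristic, passes_tests shows how test execution feedback could yield correct/incorrect labels for a test-driven discriminator, and pass_at_k is the standard unbiased pass@k estimator of Chen et al. (2021), which reduces to c/n for pass@1. All function names here are illustrative assumptions.

# A minimal sketch (not the authors' implementation) of the two
# program-specific signals described above. All names are assumptions.
import ast
from math import comb

def rename_variables(source: str) -> str:
    # One simple equivalence-preserving transformation: systematically
    # rename every name the snippet itself binds (assignment targets,
    # loop variables, function arguments). Builtins and imports are never
    # bound here, so they stay untouched and behavior is preserved.
    tree = ast.parse(source)
    bound = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            bound.add(node.id)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
    mapping = {name: f"v{i}" for i, name in enumerate(sorted(bound))}

    class Renamer(ast.NodeTransformer):
        def visit_Name(self, node):
            node.id = mapping.get(node.id, node.id)
            return node

        def visit_arg(self, node):
            node.arg = mapping.get(node.arg, node.arg)
            return node

    return ast.unparse(Renamer().visit(tree))  # ast.unparse needs Python 3.9+

def passes_tests(source: str, tests: list[str]) -> bool:
    # Test execution feedback: a candidate is labeled correct for the
    # discriminator only if it defines without error and every
    # assertion-style test passes. A real pipeline would sandbox these
    # exec calls, since the code being run is model-generated.
    env: dict = {}
    try:
        exec(source, env)
        for test in tests:
            exec(test, env)
        return True
    except Exception:
        return False

def pass_at_k(n: int, c: int, k: int) -> float:
    # Unbiased pass@k estimator (Chen et al., 2021): probability that at
    # least one of k samples from n candidates (c of them correct) passes
    # all tests; for k = 1 it reduces to c / n.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

candidate = "def add(a, b):\n    total = a + b\n    return total"
tests = ["assert add(1, 2) == 3", "assert add(-1, 1) == 0"]
variant = rename_variables(candidate)  # same behavior, new surface form
print(passes_tests(candidate, tests), passes_tests(variant, tests))  # True True
print(pass_at_k(n=20, c=5, k=1))  # 0.25

In such a pipeline, the boolean label from passes_tests would supply the supervision for the discriminative task, while variants from rename_variables enlarge the set of functionally correct references seen during training.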
Pages: 6
Related Papers
50 records in total
  • [31] Investigating Multi-task Learning for Automatic Speech Recognition with Code-switching between Mandarin and English
    Song, Xiao
    Zou, Yuexian
    Huang, Shilei
    Chen, Shaobin
    Liu, Yi
    2017 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2017, : 27 - 30
  • [32] A unified multi-task learning model for AST-level and token-level code completion
    Liu, Fang
    Li, Ge
    Wei, Bolin
    Xia, Xin
    Fu, Zhiyi
    Jin, Zhi
    EMPIRICAL SOFTWARE ENGINEERING, 2022, 27 (04)
  • [34] Metadata-driven Task Relation Discovery for Multi-task Learning
    Zheng, Zimu
    Wang, Yuqi
    Dai, Quanyu
    Zheng, Huadi
    Wang, Dan
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 4426 - 4432
  • [35] Multi-task Learning for Multilingual Neural Machine Translation
    Wang, Yiren
    Zhai, ChengXiang
    Awadalla, Hany Hassan
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 1022 - 1034
  • [36] Episodic Multi-Task Learning with Heterogeneous Neural Processes
    Shen, Jiayi
    Zhen, Xiantong
    Wang, Qi
    Worring, Marcel
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [37] Dynamic Multi-Task Learning with Convolutional Neural Network
    Fang, Yuchun
    Ma, Zhengyan
    Zhang, Zhaoxiang
    Zhang, Xu-Yao
    Bai, Xiang
    PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 1668 - 1674
  • [38] Neural Multi-Task Learning for Citation Function and Provenance
    Su, Xuan
    Prasad, Animesh
    Kan, Min-Yen
    Sugiyama, Kazunari
    2019 ACM/IEEE JOINT CONFERENCE ON DIGITAL LIBRARIES (JCDL 2019), 2019, : 394 - 395
  • [39] Scheduled Multi-task Learning for Neural Chat Translation
    Liang, Yunlong
    Meng, Fandong
    Xu, Jinan
    Chen, Yufeng
    Zhou, Jie
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 4375 - 4388
  • [40] Multi-Task Learning with Language Modeling for Question Generation
    Zhou, Wenjie
    Zhang, Minghua
    Wu, Yunfang
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 3394 - 3399