Multi-stage guided code generation for Large Language Models

Cited: 0
Authors
Han, Yewei [1 ]
Lyu, Chen [1 ]
Affiliations
[1] Shandong Normal Univ, Sch Informat Sci & Engn, Shandong Prov Key Lab Distributed Comp Software No, Univ Rd 1, Jinan, Peoples R China
Keywords
Code generation; Multi-stage; Large Language Models; Prompt technique;
DOI
10.1016/j.engappai.2024.109491
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Although Large Language Models (LLMs) have demonstrated strong performance in code generation, their effectiveness on complex programming tasks remains limited. This is primarily due to the substantial gap between the problem description and the correct code, which makes it difficult to ensure accuracy when generating code directly. When faced with a complex programming problem, human programmers usually work in multiple stages to reduce the difficulty of development: they first analyze the problem and devise a solution plan, then design a code architecture based on that plan, and finally write the detailed code. Motivated by this, we propose a multi-stage guided code generation strategy that gradually shortens the transformation distance between the problem description and the correct code, thereby improving the accuracy of code generation. Specifically, the approach consists of three stages: planning, design, and implementation. In the planning stage, the Large Language Model (LLM) generates a solution plan from the problem description; in the design stage, a code architecture is designed based on the solution plan; and in the implementation stage, the solution plan and code architecture together guide the LLM in generating the final code. Additionally, we found that existing competition-level code generation benchmarks may overlap with the training data of the Chat Generative Pre-trained Transformer (ChatGPT), posing a risk of data leakage. To validate this finding and avoid the risk, we created a competition-level code generation dataset named CodeC, which contains data never used to train ChatGPT. Experimental results show that our method outperforms state-of-the-art baselines. On the CodeC dataset, our approach achieves a 34.7% relative improvement in Pass@1 over ChatGPT's direct generation. We have published the dataset at https://github.com/hcode666/MSG for further academic research and validation.
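The three-stage pipeline described in the abstract can be illustrated with a minimal Python sketch. Everything below is an illustrative assumption rather than the paper's code: the llm callable stands in for any prompt-to-completion interface (e.g. a wrapper around a chat API), and the prompt wording is invented, not taken from the paper.

def plan(llm, problem: str) -> str:
    """Stage 1 (planning): ask the model for a natural-language solution plan."""
    return llm(
        "Analyze the following programming problem and write a concise,"
        " step-by-step solution plan (no code):\n" + problem
    )

def design(llm, problem: str, solution_plan: str) -> str:
    """Stage 2 (design): turn the plan into a code architecture
    (function/class signatures with docstrings, no bodies)."""
    return llm(
        "Given the problem and solution plan below, design a code architecture"
        " as signatures with docstrings but no bodies.\n"
        f"Problem:\n{problem}\nPlan:\n{solution_plan}"
    )

def implement(llm, problem: str, solution_plan: str, architecture: str) -> str:
    """Stage 3 (implementation): generate the final code, guided by
    both the solution plan and the code architecture."""
    return llm(
        "Complete the architecture below so it solves the problem,"
        " following the plan.\n"
        f"Problem:\n{problem}\nPlan:\n{solution_plan}\nArchitecture:\n{architecture}"
    )

def multi_stage_generate(llm, problem: str) -> str:
    """Chain the three stages; each stage's output conditions the next prompt."""
    solution_plan = plan(llm, problem)
    architecture = design(llm, problem, solution_plan)
    return implement(llm, problem, solution_plan, architecture)

Each stage's output is folded into the next stage's prompt, which is how the transformation distance from problem description to final code is shortened one step at a time.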
Pages: 13
Related Papers
50 records in total
  • [31] Hot or Cold? Adaptive Temperature Sampling for Code Generation with Large Language Models
    Zhu, Yuqi
    Li, Jia
    Li, Ge
    Zhao, YunFei
    Li, Jia
    Jin, Zhi
    Mei, Hong
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 1, 2024, : 437 - 445
  • [32] Multi-Stage Vision Token Dropping: Towards Efficient Multimodal Large Language Model
    Liu, Ting
    Shi, Liangtao
    Hong, Richang
    Hu, Yue
    Yin, Quanjun
    Zhang, Linfeng
    arXiv,
  • [33] Multi-Stage Prompting for Knowledgeable Dialogue Generation
    Liu, Zihan
    Patwary, Mostofa
    Prenger, Ryan
    Prabhumoye, Shrimai
    Ping, Wei
    Shoeybi, Mohammad
    Catanzaro, Bryan
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 1317 - 1337
  • [34] An effective multi-stage background generation algorithm
    Bevilacqua, A
    Di Stefano, L
    Lanza, A
    AVSS 2005: ADVANCED VIDEO AND SIGNAL BASED SURVEILLANCE, PROCEEDINGS, 2005, : 388 - 393
  • [35] Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation
    Liu, Jiawei
    Xia, Chunqiu Steven
    Wang, Yuyao
    Zhang, Lingming
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [36] Multi-stage Programming in the Large with Staged Classes
    Parreaux, Lionel
    Shaikhha, Amir
    GPCE '2020: PROCEEDINGS OF THE 19TH ACM SIGPLAN INTERNATIONAL CONFERENCE ON GENERATIVE PROGRAMMING: CONCEPTS AND EXPERIENCES, 2020, : 35 - 49
  • [37] MULTI-STAGE DECISION MAKING MODELS IN PSYCHOLOGY
    KLEITER, GD
    PSYCHOLOGISCHE BEITRAGE, 1974, 16 (01): : 93 - 127
  • [38] Multi-stage validation of pesticide leaching models
    Armstrong, AC
    Portwood, AM
    Harris, GL
    Leeds-Harrison, PB
    Catt, JA
    ENVIRONMENTAL FATE OF XENOBIOTICS, 1996, : 321 - 328
  • [39] L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models
    Ni, Ansong
    Yin, Pengcheng
    Zhao, Yilun
    Riddell, Martin
    Feng, Troy
    Shen, Rui
    Yin, Stephen
    Liu, Ye
    Yavuz, Semih
    Xiong, Caiming
    Joty, Shafiq
    Zhou, Yingbo
    Radev, Dragomir
    Cohan, Arman
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2024, 12 : 1311 - 1329
  • [40] Experiences with an object-oriented, multi-stage language
    Neverov, Gregory
    Roe, Paul
    SCIENCE OF COMPUTER PROGRAMMING, 2006, 62 (01) : 85 - 94