Multi-stage guided code generation for Large Language Models

Cited: 0
Authors
Han, Yewei [1 ]
Lyu, Chen [1 ]
Affiliations
[1] Shandong Normal Univ, Sch Informat Sci & Engn, Shandong Prov Key Lab Distributed Comp Software No, Univ Rd 1, Jinan, Peoples R China
Keywords
Code generation; Multi-stage; Large Language Models; Prompt technique;
DOI
10.1016/j.engappai.2024.109491
CLC Classification Number
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812 ;
Abstract
Although Large Language Models (LLMs) have demonstrated strong performance in code generation, their effectiveness on complex programming tasks remains limited. This is primarily due to the substantial transformation distance between a problem description and its correct code, which makes it difficult to ensure accuracy when generating code directly. When faced with a complex programming problem, human programmers typically work in multiple stages to reduce the difficulty of development: they first analyze the problem and devise a solution plan, then design a code architecture based on that plan, and finally write the detailed code. Motivated by this, we propose a multi-stage guided code generation strategy that gradually shortens the transformation distance between the problem description and the correct code, thereby improving the accuracy of code generation. Specifically, the approach consists of three stages: planning, design, and implementation. In the planning stage, the Large Language Model (LLM) generates a solution plan from the problem description; in the design stage, a code architecture is designed based on the solution plan; and in the implementation stage, the solution plan and code architecture together guide the LLM in generating the final code. Additionally, we found that existing competition-level code generation benchmarks may overlap with the training data of the Chat Generative Pre-trained Transformer (ChatGPT), posing a risk of data leakage. To validate this finding and avoid the risk, we created a competition-level code generation dataset named CodeC, which contains data never used to train ChatGPT. Experimental results show that our method outperforms state-of-the-art baselines: on the CodeC dataset, it achieves a 34.7% relative improvement in the Pass@1 metric over ChatGPT's direct generation. We have published the relevant dataset at https://github.com/hcode666/MSG for further academic research and validation.
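To make the three-stage strategy described in the abstract concrete, the sketch below chains three LLM calls, feeding each stage's output into the next prompt. It is a minimal illustration only: the chat helper, its wiring, and the prompt wording are assumptions, not the authors' implementation (see the MSG repository above for the published artifacts).

    # Minimal sketch of the planning -> design -> implementation pipeline.
    # `chat` is a hypothetical wrapper around any LLM chat-completion
    # endpoint; the prompts paraphrase the paper's stages and are not the
    # authors' exact ones.

    def chat(prompt: str) -> str:
        """Placeholder for a call to an LLM chat-completion API."""
        raise NotImplementedError("connect this to your LLM provider")

    def multi_stage_generate(problem: str) -> str:
        # Stage 1 (planning): analyze the problem and produce a solution plan.
        plan = chat(
            "Analyze the following programming problem and outline a "
            f"step-by-step solution plan.\n\nProblem:\n{problem}"
        )
        # Stage 2 (design): turn the plan into a code architecture,
        # e.g. function signatures and comments without bodies.
        architecture = chat(
            "Design a code architecture (signatures and comments only) "
            f"that follows this plan.\n\nProblem:\n{problem}\n\nPlan:\n{plan}"
        )
        # Stage 3 (implementation): generate the final code, guided by
        # both the plan and the architecture.
        return chat(
            "Write the complete solution code.\n\n"
            f"Problem:\n{problem}\n\nPlan:\n{plan}\n\nArchitecture:\n{architecture}"
        )

Each intermediate artifact conditions the next stage, which is how the strategy shortens the problem-to-code transformation distance the abstract identifies as the main source of errors.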
Pages: 13
Related Papers
50 records in total
  • [21] Influence of the number of decision stages on multi-stage renewable generation expansion models
    Dominguez, R.
    Carrion, M.
    Conejo, A. J.
    INTERNATIONAL JOURNAL OF ELECTRICAL POWER & ENERGY SYSTEMS, 2021, 126
  • [22] Evaluating emotional and subjective responses in synthetic art-related dialogues: A multi-stage framework with large language models
    Luna-Jimenez, Cristina
    Gil-Martin, Manuel
    D'Haro, Luis Fernando
    Fernandez-Martinez, Fernando
    San-Segundo, Ruben
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 255
  • [23] Evaluation of a multi-stage guided search approach for the calibration of building energy simulation models
    Cipriano, J.
    Mor, G.
    Chemisana, D.
    Perez, D.
    Gamboa, G.
    Cipriano, X.
    ENERGY AND BUILDINGS, 2015, 87 : 370 - 385
  • [24] Multi-Stage Satellite Phase and Code Bias Estimation
    Wen, Zhibo
    Henkel, Patrick
    Guenther, Christoph
    AUTOMATIKA, 2012, 53 (04) : 373 - 381
  • [25] A Novel Multi-Stage Prompting Approach for Language Agnostic MCQ Generation Using GPT
    Maity, Subhankar
    Deroy, Aniket
    Sarkar, Sudeshna
    ADVANCES IN INFORMATION RETRIEVAL, ECIR 2024, PT III, 2024, 14610 : 268 - 277
  • [26] BuildIt: A Type-Based Multi-stage Programming Framework for Code Generation in C++
    Brahmakshatriya, Ajay
    Amarasinghe, Saman
    CGO '21: PROCEEDINGS OF THE 2021 IEEE/ACM INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION (CGO), 2021, : 39 - 51
  • [27] Invited Paper: VerilogEval: Evaluating Large Language Models for Verilog Code Generation
    Liu, Mingjie
    Pinckney, Nathaniel
    Khailany, Brucek
    Ren, Haoxing
    2023 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER AIDED DESIGN, ICCAD, 2023
  • [28] Code-level quantum circuit generation based on large language models
    He, Zhimin
    Li, Guohong
    Situ, Haozhen
    Zhou, Yan
    Zheng, Shenggen
    Li, Lvzhou
    SCIENTIA SINICA-PHYSICA MECHANICA & ASTRONOMICA, 2025, 55 (04)
  • [29] FormalEval: A Method for Automatic Evaluation of Code Generation via Large Language Models
    Yang, Sichao
    Yang, Ye
    2024 INTERNATIONAL SYMPOSIUM OF ELECTRONICS DESIGN AUTOMATION, ISEDA 2024, 2024, : 660 - 665
  • [30] Automatic Generation of Programming Exercises and Code Explanations Using Large Language Models
    Sarsa, Sami
    Denny, Paul
    Hellas, Arto
    Leinonen, Juho
    PROCEEDINGS OF THE 2022 ACM CONFERENCE ON INTERNATIONAL COMPUTING EDUCATION RESEARCH, ICER 2022, VOL. 1, 2023, : 27 - 43