Multi-stage guided code generation for Large Language Models

Cited: 0
Authors
Han, Yewei [1 ]
Lyu, Chen [1 ]
Affiliations
[1] Shandong Normal Univ, Sch Informat Sci & Engn, Shandong Prov Key Lab Distributed Comp Software No, Univ Rd 1, Jinan, Peoples R China
Keywords
Code generation; Multi-stage; Large Language Models; Prompt technique;
DOI
10.1016/j.engappai.2024.109491
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Although Large Language Models (LLMs) have demonstrated strong performance in code generation, their effectiveness on complex programming tasks remains limited. This is primarily due to the substantial gap between the problem description and the correct code, which makes it difficult to ensure accuracy when generating code directly. When faced with a complex programming problem, human programmers usually work in multiple stages to reduce the difficulty of development: they first analyze the problem and devise a solution plan, then design a code architecture based on that plan, and finally write the detailed code. Motivated by this, we propose a multi-stage guided code generation strategy that gradually shortens the transformation distance between the problem description and the correct code, thereby improving the accuracy of code generation. Specifically, the approach consists of three stages: planning, design, and implementation. In the planning stage, the Large Language Model (LLM) generates a solution plan from the problem description; in the design stage, a code architecture is designed based on the solution plan; and in the implementation stage, the solution plan and code architecture together guide the LLM in generating the final code. Additionally, we found that existing competition-level code generation benchmarks may overlap with the training data of the Chat Generative Pre-trained Transformer (ChatGPT), posing a risk of data leakage. To validate this finding and avoid the risk, we created a competition-level code generation dataset named CodeC, which contains data never used to train ChatGPT. Experimental results show that our method outperforms state-of-the-art baselines. On the CodeC dataset, our approach achieves a 34.7% relative improvement in Pass@1 over ChatGPT's direct generation. We have published the dataset at https://github.com/hcode666/MSG for further academic research and validation.
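The three-stage pipeline described in the abstract can be illustrated with a minimal Python sketch. Everything below is an illustrative assumption rather than the paper's code: the llm callable stands in for any prompt-to-completion interface (e.g. a wrapper around a chat API), and the prompt wording is invented, not taken from the paper.

def plan(llm, problem: str) -> str:
    """Stage 1 (planning): ask the model for a natural-language solution plan."""
    return llm(
        "Analyze the following programming problem and write a concise,"
        " step-by-step solution plan (no code):\n" + problem
    )

def design(llm, problem: str, solution_plan: str) -> str:
    """Stage 2 (design): turn the plan into a code architecture
    (function/class signatures with docstrings, no bodies)."""
    return llm(
        "Given the problem and solution plan below, design a code architecture"
        " as signatures with docstrings but no bodies.\n"
        f"Problem:\n{problem}\nPlan:\n{solution_plan}"
    )

def implement(llm, problem: str, solution_plan: str, architecture: str) -> str:
    """Stage 3 (implementation): generate the final code, guided by
    both the solution plan and the code architecture."""
    return llm(
        "Complete the architecture below so it solves the problem,"
        " following the plan.\n"
        f"Problem:\n{problem}\nPlan:\n{solution_plan}\nArchitecture:\n{architecture}"
    )

def multi_stage_generate(llm, problem: str) -> str:
    """Chain the three stages; each stage's output conditions the next prompt."""
    solution_plan = plan(llm, problem)
    architecture = design(llm, problem, solution_plan)
    return implement(llm, problem, solution_plan, architecture)

Each stage's output is folded into the next stage's prompt, which is how the transformation distance from problem description to final code is shortened one step at a time.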
Pages: 13
Related Papers
50 records in total
  • [31] Hot or Cold? Adaptive Temperature Sampling for Code Generation with Large Language Models
    Zhu, Yuqi
    Li, Jia
    Li, Ge
    Zhao, YunFei
    Li, Jia
    Jin, Zhi
    Mei, Hong
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 1, 2024, : 437 - 445
  • [32] Multi-Stage Vision Token Dropping: Towards Efficient Multimodal Large Language Model
    Liu, Ting
    Shi, Liangtao
    Hong, Richang
    Hu, Yue
    Yin, Quanjun
    Zhang, Linfeng
    arXiv,
  • [33] Multi-Stage Prompting for Knowledgeable Dialogue Generation
    Liu, Zihan
    Patwary, Mostofa
    Prenger, Ryan
    Prabhumoye, Shrimai
    Ping, Wei
    Shoeybi, Mohammad
    Catanzaro, Bryan
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 1317 - 1337
  • [34] An effective multi-stage background generation algorithm
    Bevilacqua, A
    Di Stefano, L
    Lanza, A
    AVSS 2005: ADVANCED VIDEO AND SIGNAL BASED SURVEILLANCE, PROCEEDINGS, 2005, : 388 - 393
  • [35] Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation
    Liu, Jiawei
    Xia, Chunqiu Steven
    Wang, Yuyao
    Zhang, Lingming
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [36] Multi-stage Programming in the Large with Staged Classes
    Parreaux, Lionel
    Shaikhha, Amir
    GPCE '2020: PROCEEDINGS OF THE 19TH ACM SIGPLAN INTERNATIONAL CONFERENCE ON GENERATIVE PROGRAMMING: CONCEPTS AND EXPERIENCES, 2020, : 35 - 49
  • [37] MULTI-STAGE DECISION MAKING MODELS IN PSYCHOLOGY
    KLEITER, GD
    PSYCHOLOGISCHE BEITRAGE, 1974, 16 (01): : 93 - 127
  • [38] Multi-stage validation of pesticide leaching models
    Armstrong, AC
    Portwood, AM
    Harris, GL
    Leeds-Harrison, PB
    Catt, JA
    ENVIRONMENTAL FATE OF XENOBIOTICS, 1996, : 321 - 328
  • [39] L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models
    Ni, Ansong
    Yin, Pengcheng
    Zhao, Yilun
    Riddell, Martin
    Feng, Troy
    Shen, Rui
    Yin, Stephen
    Liu, Ye
    Yavuz, Semih
    Xiong, Caiming
    Joty, Shafiq
    Zhou, Yingbo
    Radev, Dragomir
    Cohan, Arman
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2024, 12 : 1311 - 1329
  • [40] Experiences with an object-oriented, multi-stage language
    Neverov, Gregory
    Roe, Paul
    SCIENCE OF COMPUTER PROGRAMMING, 2006, 62 (01) : 85 - 94