Code Generation Using Machine Learning: A Systematic Review

Authors
Dehaerne, Enrique [1, 2]
Dey, Bappaditya [2]
Halder, Sandip [2]
De Gendt, Stefan [1, 3]
Meert, Wannes [1]
Affiliations
[1] KU Leuven, Department of Computer Science, Leuven 3001, Belgium
[2] Interuniversity Microelectronics Centre (IMEC), Leuven 3001, Belgium
[3] KU Leuven, Department of Chemistry, Leuven 3001, Belgium
Keywords
Application programs - Automatic programming - Computer programming languages - Computer systems programming - Data mining - Database systems - Learning algorithms - Learning systems - Natural language processing systems - Network architecture - Program debugging - Recurrent neural networks - Software design
DOI
Not available
Abstract
Recently, machine learning (ML) methods have been used to create powerful language models for a broad range of natural language processing tasks. An important subset of this field is generating programming-language code for automatic software development. This review provides a broad and detailed overview of studies on code generation using ML. We selected 37 publications indexed in the arXiv and IEEE Xplore databases that train ML models on programming language data to generate code. The three paradigms of code generation we identified in these studies are description-to-code, code-to-description, and code-to-code. The most popular applications in these paradigms were found to be code generation from natural language descriptions, documentation generation, and automatic program repair, respectively. The most frequently used ML models in these studies include recurrent neural networks, transformers, and convolutional neural networks. Other neural network architectures, as well as non-neural techniques, were also observed. In this review, we summarize the applications, models, datasets, results, limitations, and future work of the 37 publications. Additionally, we include discussions on topics general to the literature reviewed. These include comparisons of model types and tokenizers, the volume and quality of data used, and methods for evaluating synthesized code. Furthermore, we provide three suggestions for future work on code generation using ML. © 2013 IEEE.
Pages: 82434-82455
Related papers
  • [1] Code Generation Using Machine Learning: A Systematic Review
    Dehaerne, Enrique
    Dey, Bappaditya
    Halder, Sandip
    De Gendt, Stefan
    Meert, Wannes
    [J]. IEEE ACCESS, 2022, 10 : 82434 - 82455
  • [2] DeeperCoder: Code Generation Using Machine Learning
    Shim, Simon
    Patil, Pradnyesh
    Yadav, Rajiv Ramesh
    Shinde, Anurag
    Devale, Venkatesh
    [J]. 2020 10TH ANNUAL COMPUTING AND COMMUNICATION WORKSHOP AND CONFERENCE (CCWC), 2020, : 194 - 199
  • [3] Code Generation by Example Using Symbolic Machine Learning
    Lano K.
    Xue Q.
    [J]. SN Computer Science, 4 (2)
  • [4] Automating Code Generation for MDE using Machine Learning
    Xue, Qiaomu
    [J]. 2023 IEEE/ACM 45TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: COMPANION PROCEEDINGS, ICSE-COMPANION, 2023, : 221 - 223
  • [5] A systematic review of code generation proposals from state machine specifications
    Dominguez, Eladio
    Perez, Beatriz
    Rubio, Angel L.
    Zapata, Maria A.
    [J]. INFORMATION AND SOFTWARE TECHNOLOGY, 2012, 54 (10) : 1045 - 1066
  • [6] A Survey on Source Code Review Using Machine Learning
    Wang Xiaomeng
    Zhang Tao
    Xin Wei
    Hou Changyu
    [J]. 2018 3RD INTERNATIONAL CONFERENCE ON INFORMATION SYSTEMS ENGINEERING (ICISE), 2018, : 56 - 60
  • [7] Language learning using Machine Learning: a systematic review
    Cruzado, Javier Gamboa
    Huamani-Jeri, Jhon
    Najarro-Buitron, Abel
    Sanchez, Augusto Hidalgo
    Chaca, Marisol Daga
    Zegarra, Indalecio Horna
    [J]. APUNTES UNIVERSITARIOS, 2022, 12 (04) : 321 - 345
  • [8] Machine Learning Approaches for Code Smell Detection: A Systematic Literature Review
    Grujić, Katarina-Glorija
    Prokić, Simona
    Kovačević, Aleksandar
    Luburić, Nikola
    Vidaković, Dragan
    Slivka, Jelena
    [J]. SSRN, 2022
  • [9] A systematic literature review on the use of machine learning in code clone research
    Kaur, Manpreet
    Rattan, Dhavleesh
    [J]. COMPUTER SCIENCE REVIEW, 2023, 47
  • [10] Using Machine Learning for Pharmacovigilance: A Systematic Review
    Pilipiec, Patrick
    Liwicki, Marcus
    Bota, Andras
    [J]. PHARMACEUTICS, 2022, 14 (02)