Cnerator: A Python application for the controlled stochastic generation of standard C source code

Cited by: 1
Authors
Ortin, Francisco [1, 2]
Escalada, Javier [1]
Affiliations
[1] Univ Oviedo, Comp Sci Dept, Federico Garcia Lorca 18, Oviedo 33007, Spain
[2] Munster Technol Univ, Dept Comp Sci, Rossa Ave, Cork, Ireland
Keywords
Big code; Mining software repositories; Machine learning; C programming language; Stochastic program generation; Python
DOI
10.1016/j.softx.2021.100711
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
The Big Code and Mining Software Repositories research lines analyze large amounts of source code to improve software engineering practices. Massive codebases are used to train machine learning models aimed at improving the software development process. One example is decompilation, where C code and its compiled binaries can be used to train machine learning models to improve decompilation. However, obtaining massive codebases of portable C code is not an easy task, since most applications use particular libraries, operating systems, or language extensions. In this paper, we present Cnerator, a Python application that provides the stochastic generation of large amounts of standard C code. It is highly configurable, allowing the user to specify the probability distributions of each language construct, properties of the generated code, and post-processing modifications of the output programs. Cnerator has been successfully used to generate code that, utilized to train machine learning models, has improved the performance of existing decompilers. It has also been used in the implementation of an infrastructure for the automatic extraction of code patterns. (C) 2021 The Author(s). Published by Elsevier B.V.
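The idea of generating C code from per-construct probabilities can be illustrated with a minimal Python sketch. It is not Cnerator's implementation or API: the weight table `STATEMENT_WEIGHTS`, the helper names (`gen_expr`, `gen_stmt`, `gen_function`, `generate_program`), and the small subset of C it covers are all assumptions made for illustration. The sketch also makes no attempt to avoid undefined behavior such as signed integer overflow, which a real generator of standard C would need to handle.

```python
import random

# Hypothetical probabilities for each statement kind (illustrative only,
# not taken from Cnerator's configuration).
STATEMENT_WEIGHTS = {"assign": 0.6, "if": 0.25, "for": 0.15}
OPERATORS = ["+", "-", "*"]
MAX_DEPTH = 2


def gen_expr(rng, names, depth=0):
    """Build a random integer expression over the given variable names."""
    if depth >= MAX_DEPTH or rng.random() < 0.4:
        # Leaf node: either a variable or a small literal.
        return rng.choice(names) if rng.random() < 0.5 else str(rng.randint(0, 9))
    left = gen_expr(rng, names, depth + 1)
    right = gen_expr(rng, names, depth + 1)
    return f"({left} {rng.choice(OPERATORS)} {right})"


def gen_stmt(rng, names, depth=0):
    """Return a list of C source lines for one randomly chosen statement."""
    pad = "    " * (depth + 1)
    kinds = list(STATEMENT_WEIGHTS)
    weights = list(STATEMENT_WEIGHTS.values())
    kind = rng.choices(kinds, weights=weights)[0]
    if depth >= MAX_DEPTH:
        kind = "assign"  # stop nesting so generation always terminates
    if kind == "assign":
        target = rng.choice(["x", "y"])
        return [f"{pad}{target} = {gen_expr(rng, names)};"]
    if kind == "if":
        header = f"{pad}if ({gen_expr(rng, names)} > {gen_expr(rng, names)}) {{"
    else:
        counter = f"i{depth}"  # one counter name per nesting level
        bound = rng.randint(1, 5)
        header = f"{pad}for (int {counter} = 0; {counter} < {bound}; {counter}++) {{"
    body = gen_stmt(rng, names, depth + 1)
    return [header, *body, f"{pad}}}"]


def gen_function(rng, name, n_stmts):
    """Generate one C function with two parameters and two locals."""
    names = ["a", "b", "x", "y"]
    lines = [f"int {name}(int a, int b) {{", "    int x = 0;", "    int y = 0;"]
    for _ in range(n_stmts):
        lines.extend(gen_stmt(rng, names))
    lines += ["    return x + y;", "}"]
    return "\n".join(lines)


def generate_program(seed, n_functions=2, n_stmts=4):
    """Generate a small C translation unit; the same seed gives the same text."""
    rng = random.Random(seed)
    functions = [gen_function(rng, f"func{i}", n_stmts) for i in range(n_functions)]
    main = "int main(void) {\n    return 0;\n}"
    return "\n\n".join(functions + [main]) + "\n"


if __name__ == "__main__":
    print(generate_program(seed=42))
```

Changing the values in `STATEMENT_WEIGHTS` shifts how often each construct appears, and fixing the seed makes a generated corpus reproducible.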
Pages: 7
Related papers (50 in total)
  • [1] Code2graph: Automatic Generation of Static Call Graphs for Python Source Code
    Gharibi, Gharib
    Tripathi, Rashmi
    Lee, Yugyung
    [J]. PROCEEDINGS OF THE 2018 33RD IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE' 18), 2018, : 880 - 883
  • [2] Python Code Generation for Explicit MPC in MPT
    Takacs, Balint
    Stevek, Juraj
    Valo, Richard
    Kvasnica, Michal
    [J]. 2016 EUROPEAN CONTROL CONFERENCE (ECC), 2016, : 1328 - 1333
  • [3] Python Code Generation by Asking Clarification Questions
    Li, Haau-Sing
    Mesgar, Mohsen
    Martins, Andre F. T.
    Gurevych, Iryna
    [J]. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 14287 - 14306
  • [4] Seq2Code: Transformer-Based Encoder-Decoder Model for Python Source Code Generation
    Laskari, Naveen Kumar
    Reddy, K. Adi Narayana
    Reddy, M. Indrasena
    [J]. THIRD CONGRESS ON INTELLIGENT SYSTEMS, CIS 2022, VOL 1, 2023, 608 : 301 - 309
  • [5] Chaos to Clarity with Semantic Inferencing for Python Source Code Snippets
    Stein, Aviel
    Mancoridis, Spiros
    [J]. 2023 IEEE 17TH INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING, ICSC, 2023, : 161 - 166
  • [6] Machine Learning Techniques For Python Source Code Vulnerability Detection
    Farasat, Talaya
    Posegga, Joachim
    [J]. PROCEEDINGS OF THE FOURTEENTH ACM CONFERENCE ON DATA AND APPLICATION SECURITY AND PRIVACY, CODASPY 2024, 2024, : 151 - 153
  • [7] Python source code vulnerability detection with named entity recognition
    Ehrenberg, Melanie
    Sarkani, Shahram
    Mazzuchi, Thomas A.
    [J]. COMPUTERS & SECURITY, 2024, 140
  • [8] GAP-Gen: Guided Automatic Python Code Generation
    Zhao, Junchen
    Song, Yurun
    Wang, Junlin
    Harris, Ian G.
    [J]. 17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023, : 37 - 51
  • [9] Transformers based Python Code Generation from Natural Language
    Swathi, Smt E.
    Vanga, Abhinav Reddy
    [J]. 2024 5TH INTERNATIONAL CONFERENCE ON INNOVATIVE TRENDS IN INFORMATION TECHNOLOGY, ICITIIT 2024, 2024,
  • [10] Code Analysis with Static Application Security Testing for Python Program
    Ma, Li
    Yang, Huihong
    Xu, Jianxiong
    Yang, Zexian
    Lao, Qidi
    Yuan, Dong
    [J]. JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2022, 94 (11): : 1169 - 1182