Linguacodus: a synergistic framework for transformative code generation in machine learning pipelines

Cited by: 0
Authors
Trofimova, Ekaterina [1]
Sataev, Emil [1]
Ustyuzhanin, Andrey [2, 3]
Affiliations
[1] Higher Sch Econ, Fac Comp Sci, Moscow, Russia
[2] Natl Univ Singapore, IFIM, Singapore, Singapore
[3] Constructor Univ, Sch Comp Sci & Engn, Bremen, Germany
Funding
Russian Science Foundation
Keywords
Automated code generation; Large language models; Machine learning pipelines;
DOI
10.7717/peerj-cs.2328
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In the ever-evolving landscape of machine learning, seamless translation of natural language descriptions into executable code remains a formidable challenge. This article introduces Linguacodus, an innovative framework designed to tackle this challenge by deploying a dynamic pipeline that iteratively transforms natural language task descriptions into code through high-level data-shaping instructions. The core of Linguacodus is a fine-tuned large language model, empowered to evaluate diverse solutions for various problems and select the most fitting one for a given task. This article details the fine-tuning process and sheds light on how natural language descriptions can be translated into functional code. Linguacodus represents a substantial leap towards automated code generation, effectively bridging the gap between task descriptions and executable code. It holds great promise for advancing machine learning applications across diverse domains. Additionally, we propose an algorithm capable of transforming a natural description of an ML task into code with minimal human interaction. In extensive experiments on a vast machine learning code dataset originating from Kaggle, we showcase the effectiveness of Linguacodus. The investigations highlight its potential applications across diverse domains, emphasizing its impact on applied machine learning in various scientific fields.
Pages: 32