On ML-Based Program Translation: Perils and Promises

Citations: 1

Authors:

Malyala, Aniketh [1]

Zhou, Katelyn [1]

Ray, Baishakhi [2]

Chakraborty, Saikat [3]

Affiliations:

[1] Silver Creek High Sch, San Jose, CA 95121 USA

[2] Columbia Univ, New York, NY USA

[3] Microsoft Res, Redmond, WA USA

Source:

2023 IEEE/ACM 45TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING-NEW IDEAS AND EMERGING RESULTS, ICSE-NIER | 2023

Keywords:

Code generation; code translation; program transformation

DOI:

10.1109/ICSE-NIER58687.2023.00017

Chinese Library Classification:

TP31 [Computer software]

Discipline classification codes:

081202; 0835

Abstract:

With the advent of new and advanced programming languages, it becomes imperative to migrate legacy software to new programming languages. Unsupervised Machine Learning-based Program Translation could play an essential role in such migration, even without a sufficiently sizeable, reliable corpus of parallel source code. However, these translators are far from perfect due to their statistical nature. This work investigates unsupervised program translators and where and why they fail. With in-depth error analysis of such failures, we have identified that the cases where such translators fail follow a few particular patterns. With this insight, we develop a rule-based program mutation engine, which pre-processes the input code if the input follows specific patterns and post-processes the output if the output follows certain patterns. We show that our code processing tool, in conjunction with the program translator, can form a hybrid program translator and significantly improve the state-of-the-art. In the future, we envision an end-to-end program translation tool where programming domain knowledge can be embedded into an ML-based translation pipeline using pre- and post-processing steps.
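
The hybrid pipeline described in the abstract (pattern-triggered pre-processing, ML translation, pattern-triggered post-processing) can be pictured with a minimal Python sketch. This is an illustration only, not the authors' tool: the two rules, the Java-to-Python direction, the names `apply_rules` and `hybrid_translate`, and the `stub_translator` standing in for an ML model are all assumptions made for the example.

```python
import re
from typing import Callable, List, Tuple

# A rule pairs a pattern check with a rewrite. Both rules below are
# hypothetical; the abstract does not list the actual patterns.
Rule = Tuple[Callable[[str], bool], Callable[[str], str]]


def apply_rules(code: str, rules: List[Rule]) -> str:
    """Apply each rewrite only when its pattern check matches the code."""
    for matches, rewrite in rules:
        if matches(code):
            code = rewrite(code)
    return code


def hybrid_translate(
    code: str,
    translator: Callable[[str], str],
    pre_rules: List[Rule],
    post_rules: List[Rule],
) -> str:
    """Pre-process the input, translate it, then post-process the output."""
    prepared = apply_rules(code, pre_rules)
    raw_output = translator(prepared)
    return apply_rules(raw_output, post_rules)


# Hypothetical pre-rule: expand `i++` before translation, since the
# assumed target language (Python) has no increment operator.
INCREMENT = re.compile(r"\b(\w+)\+\+")
pre_rules: List[Rule] = [
    (lambda c: bool(INCREMENT.search(c)),
     lambda c: INCREMENT.sub(r"\1 += 1", c)),
]

# Hypothetical post-rule: repair boolean literals the translator left
# in source-language form.
TRUE_LITERAL = re.compile(r"\btrue\b")
post_rules: List[Rule] = [
    (lambda c: bool(TRUE_LITERAL.search(c)),
     lambda c: TRUE_LITERAL.sub("True", c)),
]


def stub_translator(code: str) -> str:
    """Stand-in for an ML translator; a real model would be called here."""
    return code.replace("System.out.println(", "print(").replace(";", "")


if __name__ == "__main__":
    java_snippet = "i++;\nSystem.out.println(true);"
    print(hybrid_translate(java_snippet, stub_translator, pre_rules, post_rules))
```

Keeping the rules as plain (check, rewrite) pairs means the translator itself stays a black box, which matches the abstract's framing of domain knowledge wrapped around an ML model rather than built into it.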


Pages: 60-65

Page count: 6

