Latent Execution for Neural Program Synthesis

Citations: 0

Authors:

Chen, Xinyun [1]

Song, Dawn [1]

Tian, Yuandong [2]

Affiliations:

[1] Univ Calif Berkeley, Berkeley, CA 94720 USA

[2] Facebook AI Res, New York, NY USA

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021) | 2021, Vol. 34

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Program synthesis from input-output (IO) examples has been a long-standing challenge. While recent works demonstrated limited success on domain-specific languages (DSL), it remains highly challenging to apply them to real-world programming languages, such as C. Due to complicated syntax and token variation, there are three major challenges: (1) unlike many DSLs, programs in languages like C need to compile first and are not executed via interpreters; (2) the program search space grows exponentially when the syntax and semantics of the programming language become more complex; and (3) collecting a large-scale dataset of real-world programs is non-trivial. As a first step to address these challenges, we propose LaSynth and show its efficacy in a restricted-C domain (i.e., C code with tens of tokens, with sequential, branching, loop and simple arithmetic operations but no library call). More specifically, LaSynth learns the latent representation to approximate the execution of partially generated programs, even if they are incomplete in syntax (addressing (1)). The learned execution significantly improves the performance of next token prediction over existing approaches, facilitating search (addressing (2)). Finally, once trained with randomly generated groundtruth programs and their IO pairs, LaSynth can synthesize more concise programs that resemble human-written code. Furthermore, retraining our model with these synthesized programs yields better performance with fewer samples for both Karel and C program synthesis, indicating the promise of leveraging the learned program synthesizer to improve the dataset quality for input-output program synthesis (addressing (3)). When evaluating on whether the program execution outputs match the IO pairs, LaSynth achieves 55.2% accuracy on generating simple C code with tens of tokens including loops and branches, outperforming existing approaches without executors by around 20%.


Pages: 13
