StructCoder: Structure-Aware Transformer for Code Generation

Cited by: 0
Authors
Tipirneni, Sindhu [1 ]
Zhu, Ming [1 ]
Reddy, Chandan K. [1 ]
Affiliations
[1] Virginia Tech, Dept Comp Sci, 900 N Glebe Rd, Arlington, VA 22203 USA
Keywords
Deep learning; language models; code generation; Transformer
DOI
10.1145/3636430
Chinese Library Classification (CLC) code
TP [Automation technology; computer technology]
Subject classification code
0812
Abstract
There has been a recent surge of interest in automating software engineering tasks using deep learning. This article addresses the problem of code generation, in which the goal is to generate target code given source code in a different language or a natural language description. Most state-of-the-art deep learning models for code generation use training strategies primarily designed for natural language. However, understanding and generating code requires a more rigorous comprehension of the code syntax and semantics. With this motivation, we develop an encoder-decoder Transformer model in which both the encoder and decoder are explicitly trained to recognize the syntax and dataflow in the source and target codes, respectively. We not only make the encoder structure aware by leveraging the source code's syntax tree and dataflow graph, but we also support the decoder in preserving the syntax and dataflow of the target code by introducing two novel auxiliary tasks: Abstract Syntax Tree (AST) path prediction and dataflow prediction. To the best of our knowledge, this is the first work to introduce a structure-aware Transformer decoder that models both syntax and dataflow to enhance the quality of generated code. The proposed StructCoder model achieves state-of-the-art performance on code translation and text-to-code generation tasks in the CodeXGLUE benchmark and improves over baselines of similar size on the APPS code generation benchmark. Our code is publicly available at https://github.com/reddy-lab-code-research/StructCoder/.
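The abstract names two structural signals that the decoder is trained to predict: root-to-leaf AST paths and dataflow relations. As a rough, hypothetical illustration of what such targets look like (this is not the authors' implementation; the helper names root_to_leaf_paths and def_use_edges are invented here, and Python's standard-library ast module stands in for the paper's parsers), the sketch below extracts both structures from a small snippet:

    import ast

    def root_to_leaf_paths(source):
        # Collect the sequence of node-type names along every
        # root-to-leaf path of the abstract syntax tree.
        tree = ast.parse(source)
        paths = []
        def walk(node, prefix):
            prefix = prefix + [type(node).__name__]
            children = list(ast.iter_child_nodes(node))
            if not children:
                paths.append(prefix)
            for child in children:
                walk(child, prefix)
        walk(tree, [])
        return paths

    def def_use_edges(source):
        # Naive dataflow: link each variable read to the line of its
        # most recent assignment (ignores branches, scopes, and loops).
        last_def, edges = {}, []
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.Name):
                if isinstance(node.ctx, ast.Store):
                    last_def[node.id] = node.lineno
                elif isinstance(node.ctx, ast.Load) and node.id in last_def:
                    edges.append((node.id, last_def[node.id], node.lineno))
        return edges

    snippet = "x = 1\ny = x + 2\nprint(y)"
    print(root_to_leaf_paths(snippet)[0])  # ['Module', 'Assign', 'Name', 'Store']
    print(def_use_edges(snippet))          # [('x', 1, 2), ('y', 2, 3)]

In StructCoder, analogous structures are not computed post hoc as above but serve as auxiliary prediction targets that push the decoder to preserve the syntax and dataflow of the code it generates.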
Pages: 20
Related papers
50 items in total
  • [1] Structure-aware QR Code abstraction
    Qiao, Siyuan
    Fang, Xiaoxin
    Sheng, Bin
    Wu, Wen
    Wu, Enhua
    [J]. VISUAL COMPUTER, 2015, 31 (6-8): 1123 - 1133
  • [2] Structure-Aware Transformer for Graph Representation Learning
    Chen, Dexiong
    O'Bray, Leslie
    Borgwardt, Karsten
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [3] Table Fact Verification with Structure-Aware Transformer
    Zhang, Hongzhi
    Wang, Yingyao
    Wang, Sirui
    Cao, Xuezhi
    Zhang, Fuzheng
    Wang, Zhongyuan
    [J]. PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 1624 - 1629
  • [4] AST-T5: Structure-Aware Pretraining for Code Generation and Understanding
    Gong, Linyuan
    Elhoushi, Mostafa
    Cheung, Alvin
    [J]. PROCEEDINGS OF MACHINE LEARNING RESEARCH, 2024, 235: 15839 - 15853
  • [5] A Tree-Based Structure-Aware Transformer Decoder for Image-To-Markup Generation
    Zhong, Shuhan
    Song, Sizhe
    Li, Guanyao
    Chan, S-H Gary
    [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 5751 - 5760
  • [6] Response Generation via Structure-Aware Constraints
    Guan, Mengyu
    Wang, Zhongqing
    Zhou, Guodong
    [J]. ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2022, 21 (06)
  • [7] Retrofitting Structure-aware Transformer Language Model for End Tasks
    Fei, Hao
    Ren, Yafeng
    Ji, Donghong
    [J]. PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 2151 - 2161
  • [8] Structure-Aware Cross-Modal Transformer for Depth Completion
    Zhao, Linqing
    Wei, Yi
    Li, Jiaxin
    Zhou, Jie
    Lu, Jiwen
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 1016 - 1031