An Extensive Study of the Structure Features in Transformer-based Code Semantic Summarization

Cited: 0
Authors
Yang, Kang [1 ]
Mao, Xinjun [1 ]
Wang, Shangwen [1 ]
Qin, Yihao [1 ]
Zhang, Tanghaoran [1 ]
Lu, Yao [1 ]
Al-Sabahi, Kamal [2 ]
Affiliations
[1] Natl Univ Def Technol, Key Lab Software Engn Complex Syst, Changsha, Peoples R China
[2] Univ Technol & Appl Sci Ibra, Ibra, Oman
Funding
U.S. National Science Foundation
Keywords
Transformer; empirical study; probing task; code summarization;
DOI
10.1109/ICPC58990.2023.00024
CLC Number (Chinese Library Classification)
TP31 [Computer Software]
Discipline Classification Codes
081202; 0835
Abstract
Transformers are now widely used in code intelligence tasks. To better fit highly structured source code, various kinds of structure information are fed into the Transformer, such as positional encodings and abstract syntax tree (AST) based structures. However, it remains unclear how these structural features affect code intelligence tasks such as code summarization. Answering this question is vital for designing Transformer-based code models. Existing works eagerly introduce various structural information into Transformers but lack persuasive analysis of their individual contributions and interaction effects. In this paper, we conduct an empirical study of frequently used code structure features for code representation, including two types of position encoding features and AST-based structure features. We design a set of probing tasks to detect how these structure features behave inside the Transformer and conduct comprehensive ablation studies to investigate how they affect code semantic summarization tasks. To further validate the effectiveness of code structure features in code summarization, we evaluate Transformer models equipped with these features on a structure-dependent summarization dataset. Our experimental results reveal several findings that may inspire future study: (1) the influences of absolute positional embeddings and relative positional embeddings in the Transformer conflict with each other; (2) AST-based code structure features and relative position encoding features are strongly correlated, and their contributions to code semantic summarization overlap substantially; (3) Transformer models still have room for improvement in explicitly understanding code structure information.
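To make the two positional-feature families contrasted in the abstract concrete, the following is a minimal illustrative sketch (not code from the paper itself): a sinusoidal absolute positional encoding in the style of the original Transformer, and a clipped relative-distance matrix of the kind used by relative position encodings. All function names and the clipping distance are assumptions chosen for illustration.

```python
import numpy as np

def absolute_positional_encoding(seq_len, d_model):
    """Sinusoidal absolute positional encoding: one fixed vector per position."""
    pos = np.arange(seq_len)[:, None]          # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]       # (1, d_model/2)
    angles = pos / np.power(10000.0, 2 * i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)               # even dimensions
    pe[:, 1::2] = np.cos(angles)               # odd dimensions
    return pe

def relative_position_matrix(seq_len, max_distance=4):
    """Pairwise token offsets j - i, clipped to a window; relative encodings
    map each such offset to a learned embedding or attention bias."""
    pos = np.arange(seq_len)
    rel = pos[None, :] - pos[:, None]          # rel[i, j] = j - i
    return np.clip(rel, -max_distance, max_distance)

pe = absolute_positional_encoding(6, 8)
rel = relative_position_matrix(6)
print(pe.shape)   # (6, 8)
print(rel[0])     # [0 1 2 3 4 4]
```

The key difference probed in the paper follows from these shapes: the absolute encoding attaches information to each token independently, while the relative matrix describes pairs of tokens, which is also how AST-based pairwise structure features enter attention, hinting at why the two can overlap.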
Pages: 89-100
Page count: 12
Related Papers
50 records in total
  • [1] SeTransformer: A Transformer-Based Code Semantic Parser for Code Comment Generation
    Li, Zheng; Wu, Yonghao; Peng, Bin; Chen, Xiang; Sun, Zeyu; Liu, Yong; Paul, Doyle
    IEEE TRANSACTIONS ON RELIABILITY, 2023, 72 (01): 258-273
  • [2] A Semantic and Structural Transformer for Code Summarization Generation
    Ji, Ruyi; Tong, Zhenyu; Luo, Tiejian; Liu, Jing; Zhang, Libo
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023
  • [3] A Multi-Modal Transformer-based Code Summarization Approach for Smart Contracts
    Yang, Zhen; Keung, Jacky; Yu, Xiao; Gu, Xiaodong; Wei, Zhengyuan; Ma, Xiaoxue; Zhang, Miao
    2021 IEEE/ACM 29TH INTERNATIONAL CONFERENCE ON PROGRAM COMPREHENSION (ICPC 2021), 2021: 1-12
  • [4] Transformer-based Summarization by Exploiting Social Information
    Minh-Tien Nguyen; Van-Chien Nguyen; Huy-The Vu; Van-Hau Nguyen
    2020 12TH INTERNATIONAL CONFERENCE ON KNOWLEDGE AND SYSTEMS ENGINEERING (IEEE KSE 2020), 2020: 25-30
  • [5] An Empirical Study of Code Smells in Transformer-based Code Generation Techniques
    Siddiq, Mohammed Latif; Majumder, Shafayat H.; Mim, Maisha R.; Jajodia, Sourov; Santos, Joanna C. S.
    2022 IEEE 22ND INTERNATIONAL WORKING CONFERENCE ON SOURCE CODE ANALYSIS AND MANIPULATION (SCAM 2022), 2022: 71-82
  • [6] Compressing Transformer-Based Semantic Parsing Models using Compositional Code Embeddings
    Prakash, Prafull; Shashidhar, Saurabh Kumar; Zhao, Wenlong; Rongali, Subendhu; Khan, Haidar; Kayser, Michael
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020: 4711-4717
  • [7] Code Summarization with Structure-induced Transformer
    Wu, Hongqiu; Zhao, Hai; Zhang, Min
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021: 1078-1090
  • [8] Code Structure-Guided Transformer for Source Code Summarization
    Gao, Shuzheng; Gao, Cuiyun; He, Yulan; Zeng, Jichuan; Nie, Lunyiu; Xia, Xin; Lyu, Michael
    ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2023, 32 (01)
  • [9] Applying Transformer-Based Text Summarization for Keyphrase Generation
    Glazkova, A. V.; Morozov, D. A.
    LOBACHEVSKII JOURNAL OF MATHEMATICS, 2023, 44 (1): 123-136
  • [10] TransRSS: Transformer-based Radar Semantic Segmentation
    Zou, Hao; Xie, Zhen; Ou, Jiarong; Gao, Yutao
    2023 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA, 2023: 6965-6972