An Extensive Study of the Structure Features in Transformer-based Code Semantic Summarization

Authors
Yang, Kang [1]
Mao, Xinjun [1]
Wang, Shangwen [1]
Qin, Yihao [1]
Zhang, Tanghaoran [1]
Lu, Yao [1]
Al-Sabahi, Kamal [2]
Affiliations
[1] Natl Univ Def Technol, Key Lab Software Engn Complex Syst, Changsha, Peoples R China
[2] Univ Technol & Appl Sci ibra, Ibra, Oman
Funding
National Science Foundation (United States);
Keywords
Transformer; empirical study; probing task; code summarization;
DOI
10.1109/ICPC58990.2023.00024
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Transformers are now widely used in code intelligence tasks. To better fit highly structured source code, various kinds of structural information are passed into the Transformer, such as positional encodings and abstract syntax tree (AST) based structures. However, it is still unclear how these structural features affect code intelligence tasks such as code summarization. Addressing this problem is of vital importance for designing Transformer-based code models. Existing works are keen to introduce various structural information into Transformers but lack persuasive analysis of their contributions and interaction effects. In this paper, we conduct an empirical study of frequently used code structure features for code representation, including two types of positional encoding features and AST-based structure features. We propose a couple of probing tasks to detect how these structure features perform in the Transformer and conduct comprehensive ablation studies to investigate how they affect code semantic summarization tasks. To further validate the effectiveness of code structure features in code summarization, we assess Transformer models equipped with these features on a structure-dependent summarization dataset. Our experimental results reveal several findings that may inspire future study: (1) the influences of absolute and relative positional embeddings in the Transformer conflict with each other; (2) AST-based code structure features and relative position encoding features are strongly correlated, and their contributions to code semantic summarization overlap substantially; (3) Transformer models still have room for improvement in explicitly understanding code structure information.
Pages: 89-100
Number of pages: 12