RetCom: Information Retrieval-Enhanced Automatic Source-Code Summarization

Authors
Zhang, Yubo [1]
Liu, Yanfang [1]
Fan, Xinxin [3]
Lu, Yunfeng [2]
Affiliations
[1] Beihang Univ, Sch Comp Sci & Engn, Beijing, Peoples R China
[2] Beihang Univ, Sch Reliabil & Syst Engn, Beijing, Peoples R China
[3] Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
Keywords
Source-code summarization; deep learning; information retrieval
DOI
10.1109/QRS57517.2022.00099
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
To save software engineers' development time and improve the efficiency of their work on programs, research on automated source-code summarization (SCS), i.e., generating natural-language descriptions of source code, has become necessary in recent years. Existing SCS methods fall into two categories: information retrieval (IR)-based SCS and neural-based SCS. The latter is currently the mainstream approach; however, it struggles to generate low-frequency words, which potentially degrades performance. To address this problem, we propose RetCom, an IR-enhanced neural SCS method that improves the prediction of low-frequency words by leveraging both structural-level and semantic-level code retrieval. Furthermore, we design a token-level context-dependent mixture network to fuse the different information sources, i.e., the original code, the structurally most similar code, and the semantically most similar code. Finally, extensive experiments on two real-world datasets validate the proposed RetCom. Compared with several baseline methods, the experimental results show that our method captures more low-frequency words and achieves superior performance.
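The abstract does not specify how the mixture network is built, so the following is only a minimal sketch of the general idea of token-level mixing: at each decoding step, a set of context-dependent gate scores is turned into weights, and the next-token distributions from the three sources are blended with those weights. The function names, the softmax gate, and the toy distributions are all illustrative assumptions, not the RetCom architecture.

```python
import math


def softmax(scores):
    """Turn arbitrary real-valued scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_token_distributions(gate_scores, source_dists):
    """Blend per-source next-token distributions for ONE decoding step.

    gate_scores  : one score per source (hypothetical; in a real model these
                   would be computed from the decoder state at this step).
    source_dists : one dict {token: probability} per source.
    """
    weights = softmax(gate_scores)
    vocab = set()
    for dist in source_dists:
        vocab.update(dist)
    return {
        tok: sum(w * dist.get(tok, 0.0) for w, dist in zip(weights, source_dists))
        for tok in vocab
    }


# Toy next-token distributions (made-up numbers) from the three sources
# named in the abstract.
from_original = {"returns": 0.7, "the": 0.3}
from_structural = {"returns": 0.4, "checksum": 0.6}
from_semantic = {"checksum": 0.8, "the": 0.2}

# Equal gate scores give each source the same weight at this step.
mixed = mix_token_distributions(
    [0.0, 0.0, 0.0], [from_original, from_structural, from_semantic]
)
best_token = max(mixed, key=mixed.get)
```

In this toy case the rare word "checksum" has zero probability under the original-code distribution but becomes the most likely token once the two retrieved sources contribute, which is the kind of effect on low-frequency words that the abstract describes.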
Pages: 948-957
Number of pages: 10