Snippet Comment Generation Based on Code Context Expansion

Cited by: 1
Authors
Guo, Hanyang [1, 2]
Chen, Xiangping [3]
Huang, Yuan [4]
Wang, Yanlin [4]
Ding, Xi [5]
Zheng, Zibin [4]
Zhou, Xiaocong [5]
Dai, Hong-Ning [2]
Affiliations
[1] Sun Yat Sen Univ, Sch Software Engn, Guangzhou, Peoples R China
[2] Hong Kong Baptist Univ, Dept Comp Sci, Hong Kong, Peoples R China
[3] Sun Yat Sen Univ, Sch Commun & Design, Guangdong Key Lab Big Data Anal & Simulat Publ Op, Guangzhou, Peoples R China
[4] Sun Yat Sen Univ, Sch Software Engn, Zhuhai, Peoples R China
[5] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Snippet comment generation; code summarization; neural machine translation; contextual information;
DOI
10.1145/3611664
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Code comments play an important role in program comprehension, and automatic comment generation helps improve software maintenance efficiency. The comments that annotate a method are mainly header comments and snippet comments. A header comment describes the functionality of the entire method and so provides a general comment at the beginning of the method. Snippet comments annotate individual code segments, called code snippets, within the method body. Both kinds help developers quickly understand code semantics, improving code readability and maintainability. However, existing automatic comment generation models focus mainly on header comments, because public datasets are available to validate their performance. By contrast, datasets for snippet comments are hard to collect, because the scope of a snippet comment is difficult to determine. Moreover, code snippets are often too short to capture complete syntactic and semantic information. To address these challenges, we propose a novel snippet comment generation approach called SCGen, which uses the context of a code snippet to expand its syntactic and semantic information. First, we collect 600,243 snippet code-comment pairs from 959 Java projects. Then, we capture the variables in each code snippet and extract variable-related statements from the context. After that, we devise an algorithm to parse and traverse the abstract syntax tree (AST) information of the code snippet and its corresponding context. Finally, SCGen generates snippet comments by feeding the source code snippet and the corresponding AST information into a sequence-to-sequence-based model. We conducted extensive experiments on the collected dataset to evaluate SCGen. Our approach achieves 18.23 in BLEU-4, 18.83 in METEOR, and 23.65 in ROUGE-L, outperforming state-of-the-art comment generation models.
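The context-expansion step mentioned in the abstract (capturing variables from a snippet and pulling variable-related statements from the surrounding method) can be illustrated with a rough sketch. This is not SCGen's algorithm, which according to the abstract works on AST information; it is a simplified, regex-based approximation. The function names `identifiers` and `expand_context`, the keyword list, and the "variables start with a lowercase letter" heuristic are all assumptions made for illustration.

```python
import re

# Assumption: a small, incomplete keyword list; a real tool would use a Java parser.
JAVA_KEYWORDS = {
    "int", "long", "double", "float", "boolean", "char", "byte", "short",
    "void", "new", "return", "if", "else", "for", "while", "null",
    "true", "false", "this", "final", "static",
}

# Identifier that is not a member access (no preceding '.') and not a call (no following '(').
IDENTIFIER = re.compile(r"(?<![\w.])[A-Za-z_]\w*\b(?!\s*\()")
STRING_LITERAL = re.compile(r'"(?:\\.|[^"\\])*"')


def identifiers(statement):
    """Return the likely variable names used in one Java statement (heuristic)."""
    code = STRING_LITERAL.sub('""', statement)  # ignore text inside string literals
    return {
        tok for tok in IDENTIFIER.findall(code)
        if tok not in JAVA_KEYWORDS and (tok[0].islower() or tok[0] == "_")
    }


def expand_context(method_lines, start, end):
    """Return statements outside method_lines[start:end] that share a variable with it."""
    variables = set()
    for line in method_lines[start:end]:
        variables |= identifiers(line)
    return [
        line for i, line in enumerate(method_lines)
        if not (start <= i < end) and identifiers(line) & variables
    ]


# Hypothetical method body; the snippet to be commented is the for loop (lines 3..5).
method_lines = [
    "int total = 0;",
    "int limit = config.getLimit();",
    "String name = user.getName();",
    "for (int i = 0; i < limit; i++) {",
    "    total += values[i];",
    "}",
    "log(name);",
]
related = expand_context(method_lines, 3, 6)
```

In this toy input, only the declarations of `total` and `limit` share a variable with the loop, so only those two statements are selected as context; the statements about `name` are left out.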
Pages: 30
Related papers
50 in total
  • [31] Spotting Familiar Code Snippet Structures for Program Comprehension
    Vinayakarao, Venkatesh
    2015 10TH JOINT MEETING OF THE EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND THE ACM SIGSOFT SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE 2015) PROCEEDINGS, 2015, : 1054 - 1056
  • [32] Fast and Practical Snippet Generation for RDF Datasets
    Liu, Daxin
    Cheng, Gong
    Liu, Qingxia
    Qu, Yuzhong
    ACM TRANSACTIONS ON THE WEB, 2019, 13 (04)
  • [34] eXtract: A Snippet Generation System for XML Search
    Huang, Yu
    Liu, Ziyang
    Chen, Yi
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2008, 1 (02): : 1392 - 1395
  • [35] Target code generation using the code expansion technique for Java Bytecode
    Ko, KM
    Kim, SG
    PARALLEL AND DISTRIBUTED COMPUTING: APPLICATIONS AND TECHNOLOGIES, PROCEEDINGS, 2004, 3320 : 752 - 755
  • [36] CCGIR: Information retrieval-based code comment generation method for smart contracts
    Yang, Guang
    Liu, Ke
    Chen, Xiang
    Zhou, Yanlin
    Yu, Chi
    Lin, Hao
    KNOWLEDGE-BASED SYSTEMS, 2022, 237
  • [37] Context effects of genetic code expansion by stop codon suppression
    Chemla, Yonatan
    Ozer, Eden
    Algov, Itay
    Alfonta, Lital
    CURRENT OPINION IN CHEMICAL BIOLOGY, 2018, 46 : 146 - 155
  • [38] Evaluating Code Comment Generation with Summarized API Docs
    Matmti, Bilel
    Fard, Fatemeh
    Proceedings - 2023 IEEE/ACM 2nd International Workshop on Natural Language-Based Software Engineering, NLBSE 2023, 2023, : 60 - 63
  • [39] Developer-Intent Driven Code Comment Generation
    Mu, Fangwen
    Chen, Xiao
    Shi, Lin
    Wang, Song
    Wang, Qing
    2023 IEEE/ACM 45TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ICSE, 2023, : 768 - 780
  • [40] SeCNN: A semantic CNN parser for code comment generation
    Li, Zheng
    Wu, Yonghao
    Peng, Bin
    Chen, Xiang
    Sun, Zeyu
    Liu, Yong
    Yu, Deli
    JOURNAL OF SYSTEMS AND SOFTWARE, 2021, 181