Snippet Comment Generation Based on Code Context Expansion

Cited by: 1
Authors
Guo, Hanyang [1, 2]
Chen, Xiangping [3]
Huang, Yuan [4]
Wang, Yanlin [4]
Ding, Xi [5]
Zheng, Zibin [4]
Zhou, Xiaocong [5]
Dai, Hong-Ning [2]
Affiliations
[1] Sun Yat Sen Univ, Sch Software Engn, Guangzhou, Peoples R China
[2] Hong Kong Baptist Univ, Dept Comp Sci, Hong Kong, Peoples R China
[3] Sun Yat Sen Univ, Sch Commun & Design, Guangdong Key Lab Big Data Anal & Simulat Publ Op, Guangzhou, Peoples R China
[4] Sun Yat Sen Univ, Sch Software Engn, Zhuhai, Peoples R China
[5] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Snippet comment generation; code summarization; neural machine translation; contextual information;
DOI
10.1145/3611664
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Code comments play an important role in program comprehension, and automatic comment generation helps improve software maintenance efficiency. The comments that annotate a method fall mainly into two kinds: header comments and snippet comments. A header comment appears at the beginning of a method and describes the functionality of the entire method. Snippet comments appear at multiple code segments within the method body, where each such segment is called a code snippet. Both kinds help developers quickly understand code semantics, thereby improving code readability and maintainability. However, existing automatic comment generation models focus mainly on header comments, because public datasets are available to validate their performance. By contrast, collecting datasets for snippet comments is challenging, because the scope of a snippet comment is difficult to determine. Even worse, code snippets are often too short to capture complete syntactic and semantic information. To address these challenges, we propose a novel Snippet Comment Generation approach called SCGen. First, we utilize the context of the code snippet to expand its syntactic and semantic information. Specifically, we collect 600,243 snippet code-comment pairs from 959 Java projects. Then, we capture the variables in each code snippet and extract variable-related statements from the context. After that, we devise an algorithm to parse and traverse the abstract syntax tree (AST) of each code snippet and its corresponding context. Finally, SCGen generates snippet comments by feeding the source code snippet and the corresponding AST information into a sequence-to-sequence model. We conducted extensive experiments on the collected dataset to evaluate SCGen. Our approach achieves 18.23 BLEU-4, 18.83 METEOR, and 23.65 ROUGE-L, outperforming state-of-the-art comment generation models.
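The context-expansion step described in the abstract (capture the variables in a snippet, then pull variable-related statements from the surrounding method) can be illustrated with a minimal sketch. This is not SCGen's implementation: the abstract mentions AST parsing, whereas the sketch below substitutes a simple regex-based approximation, and the names `identifiers`, `expand_context`, and `JAVA_KEYWORDS` are hypothetical.

```python
import re

# Assumption: a small, incomplete list of Java keywords and common type names.
# A real tool would rely on a Java parser instead of this list.
JAVA_KEYWORDS = {
    "int", "long", "double", "boolean", "char", "void", "String",
    "if", "else", "for", "while", "do", "return", "new", "null",
    "true", "false", "this", "break", "continue", "final",
}

IDENT = re.compile(r"[A-Za-z_]\w*")


def identifiers(line):
    """Return names in a line that are neither keywords nor call names.

    Limitation of this sketch: words inside string literals are also matched.
    """
    names = set()
    for match in IDENT.finditer(line):
        name = match.group()
        rest = line[match.end():].lstrip()
        # Skip keywords and anything directly followed by "(" (a call or control keyword).
        if name in JAVA_KEYWORDS or rest.startswith("("):
            continue
        names.add(name)
    return names


def expand_context(method_lines, start, end):
    """Collect snippet variables and the context lines that mention them.

    The snippet is method_lines[start:end]; every other line is context.
    """
    snippet_vars = set()
    for line in method_lines[start:end]:
        snippet_vars |= identifiers(line)
    context = [
        line
        for i, line in enumerate(method_lines)
        if not (start <= i < end) and identifiers(line) & snippet_vars
    ]
    return snippet_vars, context


if __name__ == "__main__":
    # Hypothetical method body, one statement per line.
    method = [
        "int total = 0;",
        "List<String> names = load();",
        "log.info(\"start\");",
        "for (int v : values) {",
        "    total += v;",
        "}",
        "return total;",
    ]
    snippet_vars, context = expand_context(method, 3, 6)
    print(sorted(snippet_vars))
    print(context)
```

In this sketch the loop is the snippet, so only the declaration and the return statement that mention `total` are kept as context; the unrelated `names` and logging lines are dropped.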
Pages: 30