Evaluating Code Comment Generation With Summarized API Docs

Cited by: 0
Authors
Matmti, Bilel [1 ]
Fard, Fatemeh [1 ]
Affiliations
[1] Univ British Columbia, Dept Comp Sci, Okanagan, BC, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
API Docs; text summarization; comment generation; external knowledge source;
DOI
10.1109/NLBSE59153.2023.00019
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Code comment generation is the task of generating a high-level natural language description for a given code snippet. API2Com is a comment generation model designed to leverage Application Programming Interface documentation (API Docs) as an external knowledge resource. Shahbazi et al. [1] showed that API Docs might help increase the model's performance. However, as the number of APIs used in a method increases, the documentation in the input grows longer and the model's ability to generate pertinent comments deteriorates. In this paper, we evaluate how summarizing the API Docs with TextRank, an extractive text summarization technique, affects the overall performance of API2Com. The results of our experiments on the same Java dataset confirm the inverse correlation between the number of APIs and the model's performance: as the number of APIs increases, the performance metrics tend to deteriorate for both configurations of the model, with and without TextRank summarization of the API Docs. The experiments also show that the number of APIs affects how much TextRank can improve the model's performance. For example, with 8 APIs, TextRank summarization improved the model's BLEU score by 18% on average, but the performance tends to decrease as the number of APIs increases. This leaves open the question of which combination of model configuration and documentation length performs best.
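To make the summarization step concrete, the following is a minimal sketch of TextRank-style extractive summarization in plain Python. It is not the authors' implementation: the sentence splitter, the tokenizer, the function names, the `max_sentences` parameter, and the sample documentation string are all assumptions made for illustration. Sentences are nodes in a graph, edge weights come from word overlap normalized by the log of the sentence lengths, and a PageRank-style iteration scores each sentence.

```python
import math
import re


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercased alphanumeric tokens; a real pipeline would likely do more."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def similarity(words_a, words_b):
    """Shared words divided by the sum of the log sentence lengths."""
    if len(words_a) < 2 or len(words_b) < 2:
        return 0.0  # log(1) == 0 would make the denominator degenerate
    overlap = len(set(words_a) & set(words_b))
    return overlap / (math.log(len(words_a)) + math.log(len(words_b)))


def textrank_summary(text, max_sentences=2, damping=0.85, iterations=50):
    """Return the top-ranked sentences, kept in their original order."""
    sentences = split_sentences(text)
    if len(sentences) <= max_sentences:
        return sentences

    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Symmetric weighted graph over sentences, with no self-loops.
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sums = [sum(row) for row in weights]

    # PageRank-style iteration over the weighted graph.
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping)
            + damping
            * sum(
                weights[j][i] / out_sums[j] * scores[j]
                for j in range(n)
                if out_sums[j] > 0
            )
            for i in range(n)
        ]

    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:max_sentences]
    return [sentences[i] for i in sorted(top)]


# Hypothetical API documentation text, written only for this example.
doc = (
    "Returns the length of this string. "
    "The length is equal to the number of Unicode code units in the string. "
    "This method never throws an exception. "
    "See the Unicode standard for details on code units."
)
print(textrank_summary(doc, max_sentences=2))
```

In a setup like the one the abstract describes, each API's documentation would be shortened this way before being concatenated into the model input, so the input grows more slowly as the number of APIs in a method increases.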
Pages: 60-63
Page count: 4