Evaluating Code Comment Generation With Summarized API Docs

Cited by: 0

Authors:

Matmti, Bilel [1]

Fard, Fatemeh [1]

Affiliations:

[1] Univ British Columbia, Dept Comp Sci, Okanagan, BC, Canada

Source:

2023 IEEE/ACM 2ND INTERNATIONAL WORKSHOP ON NATURAL LANGUAGE-BASED SOFTWARE ENGINEERING, NLBSE | 2023

Funding:

Natural Sciences and Engineering Research Council of Canada;

Keywords:

API Docs; text summarization; comment generation; external knowledge source;

DOI:

10.1109/NLBSE59153.2023.00019

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104; 0812; 0835; 1405;

Abstract:

Code comment generation is the task of generating a high-level natural language description for a given code snippet. API2Com is a comment generation model designed to leverage Application Programming Interface documentation (API Docs) as an external knowledge resource. Shahbazi et al. [1] showed that API Docs might help increase the model's performance. However, as the number of APIs used in a method increases, the documentation included in the input grows longer and the model's ability to generate pertinent comments deteriorates. In this paper, we evaluate how summarizing the API Docs with an extractive text summarization technique, TextRank, affects the overall performance of API2Com. The results of our experiments on the same Java dataset confirm the inverse correlation between the number of APIs and the model's performance: as the number of APIs increases, the performance metrics tend to deteriorate for both configurations of the model, with and without TextRank summarization of the API Docs. The experiments also show that the number of APIs affects the capacity of TextRank to improve the model's performance. For example, with 8 APIs, TextRank summarization improved the model's BLEU score by 18% on average, but the performance tends to decrease as the number of APIs increases. This points to an open area of research: determining the winning combination of model configuration and documentation length.
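To make the summarization step concrete, the following is a minimal sketch of TextRank-style extractive summarization applied to a short API description. It is not the authors' implementation: the function names (`split_sentences`, `similarity`, `textrank_summary`), the naive sentence splitter, the set-based word-overlap similarity, and the sample text are all assumptions chosen for illustration, and the record does not state how the paper tokenizes or truncates the API Docs.

```python
import math
import re


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(a, b):
    """Shared-word count normalised by the log of each sentence's vocabulary size."""
    wa = set(re.findall(r"\w+", a.lower()))
    wb = set(re.findall(r"\w+", b.lower()))
    if not wa or not wb:
        return 0.0
    denom = math.log(len(wa)) + math.log(len(wb))
    return len(wa & wb) / denom if denom > 0 else 0.0


def textrank_summary(text, k=2, damping=0.85, iterations=50):
    """Return the k highest-ranked sentences, kept in their original order."""
    sentences = split_sentences(text)
    n = len(sentences)
    if n <= k:
        return sentences
    # Weighted, undirected sentence graph: edge weight = similarity.
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    # Weighted PageRank by fixed-count power iteration.
    for _ in range(iterations):
        scores = [
            (1 - damping)
            + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


# Hypothetical API description text, used only to exercise the sketch.
doc = (
    "Returns the number of elements in this list. "
    "Returns true if this list contains no elements. "
    "Appends the specified element to the end of this list. "
    "Throws an exception."
)
print(textrank_summary(doc, k=2))
```

In a setting like the one the abstract describes, a function of this kind would be applied to each API's documentation before it is concatenated into the model input, with `k` controlling how much documentation survives; the right value of `k` is exactly the kind of trade-off the abstract leaves open.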

Pages: 60-63

Page count: 4
