AUGER: Automatically Generating Review Comments with Pre-training Models

Cited by: 11
Authors
Li, Lingwei [1 ]
Yang, Li [2 ]
Jiang, Huaxi [1 ]
Yan, Jun [3 ]
Luo, Tiejian [4 ]
Hua, Zihan [5 ]
Liang, Geng [2 ]
Zuo, Chun [6 ]
Affiliations
[1] Univ Chinese Acad Sci, CAS, Inst Software, Beijing, Peoples R China
[2] Chinese Acad Sci, Inst Software, Beijing, Peoples R China
[3] Univ Chinese Acad Sci, CAS, Inst Software, State Key Lab Comp Sci, Beijing, Peoples R China
[4] Univ Chinese Acad Sci, Beijing, Peoples R China
[5] Univ Chinese Acad Sci, Wuhan Univ, Wuhan, Peoples R China
[6] Sinosoft Co Ltd, Beijing, Peoples R China
Funding
National Key R&D Program of China;
Keywords
Review Comments; Code Review; Text Generation; Machine Learning;
DOI
10.1145/3540250.3549099
Chinese Library Classification (CLC)
TP31 [Computer Software];
Discipline Codes
081202; 0835;
Abstract
Code review is a best practice and a powerful safeguard for software quality. In practice, senior or highly skilled reviewers inspect source code and provide constructive comments that cover issues authors may overlook, such as special cases. This collaborative validation between contributors yields higher-quality code with a lower chance of bugs. However, because personal knowledge is limited and varies across reviewers, the efficiency and effectiveness of code review still leave room for improvement: delivering useful review comments remains a substantial, time-consuming effort. This paper explores a synergy of multiple practical review comments to enhance code review and proposes AUGER (AUtomatically GEnerating Review comments): a review-comment generator built on pre-trained models. We first collect empirical review data from 11 notable Java projects and construct a dataset of 10,882 code changes. By leveraging Text-to-Text Transfer Transformer (T5) models, the framework synthesizes valuable knowledge in the training stage and outperforms baselines by 37.38% in ROUGE-L. According to criteria from prior studies, 29% of the automatically generated review comments are considered useful. Inference takes only about 20 seconds, and the model remains open to further training. Moreover, a thorough case study confirms the performance improvements.
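Since the abstract's headline result is a 37.38% improvement in ROUGE-L, a minimal self-contained sketch of that metric may help readers interpret the number. ROUGE-L scores a generated comment against a reference comment via their longest common subsequence (LCS) of tokens. The paper's exact tokenization and recall weighting are not stated here, so this sketch assumes whitespace tokens and the balanced F1 variant:

```python
def lcs_len(a, b):
    # Longest-common-subsequence length via standard dynamic programming.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]


def rouge_l_f1(candidate, reference):
    # ROUGE-L as LCS-based F1 over whitespace tokens (an assumption;
    # the original evaluation may tokenize or weight recall differently).
    c, r = candidate.split(), reference.split()
    lcs = lcs_len(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)


# Hypothetical generated vs. reference review comments:
score = rouge_l_f1("please add a null check here",
                   "add a null check for this field")
```

Here the LCS is "add a null check" (4 tokens), giving precision 4/6 and recall 4/7, so the F1 is 8/13 ≈ 0.615; a relative gain over a baseline, as reported in the abstract, is computed on averages of such per-pair scores.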
Pages: 1009-1021
Page count: 13