AUGER: Automatically Generating Review Comments with Pre-training Models

Cited by: 11
Authors
Li, Lingwei [1 ]
Yang, Li [2 ]
Jiang, Huaxi [1 ]
Yan, Jun [3 ]
Luo, Tiejian [4 ]
Hua, Zihan [5 ]
Liang, Geng [2 ]
Zuo, Chun [6 ]
Affiliations
[1] Univ Chinese Acad Sci, CAS, Inst Software, Beijing, Peoples R China
[2] Chinese Acad Sci, Inst Software, Beijing, Peoples R China
[3] Univ Chinese Acad Sci, CAS, Inst Software, State Key Lab Comp Sci, Beijing, Peoples R China
[4] Univ Chinese Acad Sci, Beijing, Peoples R China
[5] Univ Chinese Acad Sci, Wuhan Univ, Wuhan, Peoples R China
[6] Sinosoft Co Ltd, Beijing, Peoples R China
Funding
National Key R&D Program of China;
Keywords
Review Comments; Code Review; Text Generation; Machine Learning;
DOI
10.1145/3540250.3549099
Chinese Library Classification (CLC)
TP31 [Computer Software];
Discipline Codes
081202; 0835;
Abstract
Code review is a best practice and a powerful safeguard for software quality. In practice, senior or highly skilled reviewers inspect source code and provide constructive comments that cover issues authors may overlook, such as special cases. This collaborative validation between contributors yields higher-quality code with a lower chance of bugs. However, because personal knowledge is limited and varies across reviewers, the efficiency and effectiveness of code review still leave room for improvement: delivering useful review comments remains a substantial, time-consuming effort. This paper explores a synergy of multiple practical review comments to enhance code review and proposes AUGER (AUtomatically GEnerating Review comments): a review-comment generator built on pre-trained models. We first collect empirical review data from 11 notable Java projects and construct a dataset of 10,882 code changes. By leveraging Text-to-Text Transfer Transformer (T5) models, the framework synthesizes valuable knowledge in the training stage and outperforms baselines by 37.38% in ROUGE-L. According to criteria from prior studies, 29% of the automatically generated review comments are considered useful. Inference takes only about 20 seconds, and the model remains open to further training. Moreover, a thorough case study confirms the performance improvements.
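Since the abstract's headline result is a 37.38% improvement in ROUGE-L, a minimal self-contained sketch of that metric may help readers interpret the number. ROUGE-L scores a generated comment against a reference comment via their longest common subsequence (LCS) of tokens. The paper's exact tokenization and recall weighting are not stated here, so this sketch assumes whitespace tokens and the balanced F1 variant:

```python
def lcs_len(a, b):
    # Longest-common-subsequence length via standard dynamic programming.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]


def rouge_l_f1(candidate, reference):
    # ROUGE-L as LCS-based F1 over whitespace tokens (an assumption;
    # the original evaluation may tokenize or weight recall differently).
    c, r = candidate.split(), reference.split()
    lcs = lcs_len(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)


# Hypothetical generated vs. reference review comments:
score = rouge_l_f1("please add a null check here",
                   "add a null check for this field")
```

Here the LCS is "add a null check" (4 tokens), giving precision 4/6 and recall 4/7, so the F1 is 8/13 ≈ 0.615; a relative gain over a baseline, as reported in the abstract, is computed on averages of such per-pair scores.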
Pages: 1009-1021
Page count: 13