Codeflaws: A Programming Competition Benchmark for Evaluating Automated Program Repair Tools

Cited by: 92
Authors
Tan, Shin Hwei [1 ]
Yi, Jooyong [2 ]
Yulis [1 ]
Mechtaev, Sergey [1 ]
Roychoudhury, Abhik [1 ]
Affiliations
[1] Natl Univ Singapore, Singapore, Singapore
[2] Innopolis Univ, Innopolis, Russia
Keywords
automated program repair; defect classes; empirical evaluation; benchmark
DOI
10.1109/ICSE-C.2017.76
CLC Number
TP31 [Computer Software]
Subject Classification Code
081202; 0835
Abstract
Several automated program repair techniques have been proposed to reduce the time and effort spent on bug fixing. While these repair tools are designed to be generic so that they can address many kinds of software faults, different tools may fix certain types of faults more effectively than others. It is therefore important to compare the effectiveness of different repair tools on various fault types more objectively. However, existing benchmarks for automated program repair do not allow a thorough investigation of the relationship between fault types and the effectiveness of repair tools. We present Codeflaws, a set of 3902 defects from 7436 programs, automatically classified into 39 defect classes (we refer to different types of faults as defect classes, derived from the syntactic differences between a buggy program and its patched version).
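The abstract describes defect classes derived from the syntactic difference between a buggy program and its patched version. As a rough illustration only, the following Python sketch assigns a coarse class from a textual diff; the class names and the heuristic are hypothetical and much simpler than the actual Codeflaws classification, which distinguishes 39 classes.

    import difflib

    def coarse_defect_class(buggy: str, patched: str) -> str:
        """Assign a coarse defect class from the syntactic diff between a
        buggy and a patched program. Illustrative heuristic only; not the
        Codeflaws classifier."""
        diff = list(difflib.unified_diff(buggy.splitlines(),
                                         patched.splitlines(), lineterm=""))
        removed = [l for l in diff if l.startswith("-") and not l.startswith("---")]
        added = [l for l in diff if l.startswith("+") and not l.startswith("+++")]
        if not removed and added:
            return "missing-statement"          # patch only inserts code
        if removed and not added:
            return "extraneous-statement"       # patch only deletes code
        if len(removed) == 1 and len(added) == 1:
            return "single-line-modification"   # e.g. wrong operator or constant
        return "multi-line-modification"

    # Example: an off-by-one loop bound fixed by the patch
    buggy   = "for (i = 0; i <= n; i++) sum += a[i];"
    patched = "for (i = 0; i < n; i++) sum += a[i];"
    print(coarse_defect_class(buggy, patched))  # -> single-line-modification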
Pages: 180-182
Number of pages: 3
Related Papers
50 in total
  • [31] Towards a benchmark for evaluating design pattern miner tools
    Fueloep, Lajos Jeno
    Ferenc, Rudolf
    Gyimothy, Tibor
    CSMR 2008: 12TH EUROPEAN CONFERENCE ON SOFTWARE MAINTENANCE AND REENGINEERING: DEVELOPING EVOLVABLE SYSTEMS, 2008, : 143 - 152
  • [32] On the acceptance by code reviewers of candidate security patches suggested by Automated Program Repair tools
    Papotti, Aurora
    Paramitha, Ranindya
    Massacci, Fabio
    EMPIRICAL SOFTWARE ENGINEERING, 2024, 29 (05)
  • [33] SandhiKosh: A Benchmark Corpus for Evaluating Sanskrit Sandhi Tools
    Bhardwaj, Shubham
    Gantayat, Neelamadhav
    Chaturvedi, Nikhil
    Garg, Rahul
    Agarwal, Sumeet
    PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018, : 4494 - 4500
  • [34] Generating synthetic benchmark circuits for evaluating CAD tools
    Stroobandt, D
    Verplaetse, P
    Van Campenhout, J
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2000, 19 (09) : 1011 - 1022
  • [35] Evaluating Automated Software Verification Tools
    Prause, Christian R.
    Gerlich, Rainer
    Gerlich, Ralf
    2018 IEEE 11TH INTERNATIONAL CONFERENCE ON SOFTWARE TESTING, VERIFICATION AND VALIDATION (ICST), 2018, : 343 - 353
  • [36] Evaluating Feedback Tools in Introductory Programming Classes
    Reis, Ruan
    Soares, Gustavo
    Mongiovi, Melina
    Andrade, Wilkerson L.
    2019 IEEE FRONTIERS IN EDUCATION CONFERENCE (FIE 2019), 2019,
  • [37] Program Repair and Trusted Automatic Programming
    Roychoudhury, Abhik
    PROCEEDINGS OF THE 17TH INNOVATIONS IN SOFTWARE ENGINEERING CONFERENCE, ISEC 2024, 2024,
  • [38] A comprehensive study of automatic program repair on the QuixBugs benchmark
    Ye, He
    Martinez, Matias
    Durieux, Thomas
    Monperrus, Martin
    JOURNAL OF SYSTEMS AND SOFTWARE, 2021, 171
  • [39] A Comprehensive Study of Automatic Program Repair on the QuixBugs Benchmark
    Ye, He
    Martinez, Matias
    Durieux, Thomas
    Monperrus, Martin
    2019 IEEE 1ST INTERNATIONAL WORKSHOP ON INTELLIGENT BUG FIXING (IBF '19), 2019, : 1 - 10
  • [40] Attention Please: Consider Mockito when Evaluating Newly Proposed Automated Program Repair Techniques
    Wang, Shangwen
    Wen, Ming
    Mao, Xiaoguang
    Yang, Deheng
    PROCEEDINGS OF EASE 2019 - EVALUATION AND ASSESSMENT IN SOFTWARE ENGINEERING, 2019, : 260 - 266