Codeflaws: A Programming Competition Benchmark for Evaluating Automated Program Repair Tools

Cited by: 92
Authors
Tan, Shin Hwei [1 ]
Yi, Jooyong [2 ]
Yulis [1 ]
Mechtaev, Sergey [1 ]
Roychoudhury, Abhik [1 ]
Affiliations
[1] Natl Univ Singapore, Singapore, Singapore
[2] Innopolis Univ, Innopolis, Russia
Keywords
automated program repair; defect classes; empirical evaluation; benchmark
DOI
10.1109/ICSE-C.2017.76
CLC Number
TP31 [Computer Software]
Subject Classification Code
081202; 0835
Abstract
Several automated program repair techniques have been proposed to reduce the time and effort spent on bug fixing. While these repair tools are designed to be generic so that they can address many kinds of software faults, different tools may fix certain types of faults more effectively than others. It is therefore important to compare the effectiveness of different repair tools on various fault types more objectively. However, existing benchmarks for automated program repair do not allow a thorough investigation of the relationship between fault types and the effectiveness of repair tools. We present Codeflaws, a set of 3902 defects from 7436 programs, automatically classified into 39 defect classes (we refer to different types of faults as defect classes, derived from the syntactic differences between a buggy program and its patched version).
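The abstract describes defect classes derived from the syntactic difference between a buggy program and its patched version. As a rough illustration only, the following Python sketch assigns a coarse class from a textual diff; the class names and the heuristic are hypothetical and much simpler than the actual Codeflaws classification, which distinguishes 39 classes.

    import difflib

    def coarse_defect_class(buggy: str, patched: str) -> str:
        """Assign a coarse defect class from the syntactic diff between a
        buggy and a patched program. Illustrative heuristic only; not the
        Codeflaws classifier."""
        diff = list(difflib.unified_diff(buggy.splitlines(),
                                         patched.splitlines(), lineterm=""))
        removed = [l for l in diff if l.startswith("-") and not l.startswith("---")]
        added = [l for l in diff if l.startswith("+") and not l.startswith("+++")]
        if not removed and added:
            return "missing-statement"          # patch only inserts code
        if removed and not added:
            return "extraneous-statement"       # patch only deletes code
        if len(removed) == 1 and len(added) == 1:
            return "single-line-modification"   # e.g. wrong operator or constant
        return "multi-line-modification"

    # Example: an off-by-one loop bound fixed by the patch
    buggy   = "for (i = 0; i <= n; i++) sum += a[i];"
    patched = "for (i = 0; i < n; i++) sum += a[i];"
    print(coarse_defect_class(buggy, patched))  # -> single-line-modification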
Pages: 180-182
Number of pages: 3
Related Papers
50 in total
  • [31] Towards a benchmark for evaluating design pattern miner tools
    Fueloep, Lajos Jeno
    Ferenc, Rudolf
    Gyimothy, Tibor
    CSMR 2008: 12TH EUROPEAN CONFERENCE ON SOFTWARE MAINTENANCE AND REENGINEERING: DEVELOPING EVOLVABLE SYSTEMS, 2008, : 143 - 152
  • [32] On the acceptance by code reviewers of candidate security patches suggested by Automated Program Repair tools
    Papotti, Aurora
    Paramitha, Ranindya
    Massacci, Fabio
    EMPIRICAL SOFTWARE ENGINEERING, 2024, 29 (05)
  • [33] SandhiKosh: A Benchmark Corpus for Evaluating Sanskrit Sandhi Tools
    Bhardwaj, Shubham
    Gantayat, Neelamadhav
    Chaturvedi, Nikhil
    Garg, Rahul
    Agarwal, Sumeet
    PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018, : 4494 - 4500
  • [34] Generating synthetic benchmark circuits for evaluating CAD tools
    Stroobandt, D
    Verplaetse, P
    Van Campenhout, J
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2000, 19 (09) : 1011 - 1022
  • [35] Evaluating Automated Software Verification Tools
    Prause, Christian R.
    Gerlich, Rainer
    Gerlich, Ralf
    2018 IEEE 11TH INTERNATIONAL CONFERENCE ON SOFTWARE TESTING, VERIFICATION AND VALIDATION (ICST), 2018, : 343 - 353
  • [36] Evaluating Feedback Tools in Introductory Programming Classes
    Reis, Ruan
    Soares, Gustavo
    Mongiovi, Melina
    Andrade, Wilkerson L.
    2019 IEEE FRONTIERS IN EDUCATION CONFERENCE (FIE 2019), 2019,
  • [37] Program Repair and Trusted Automatic Programming
    Roychoudhury, Abhik
    PROCEEDINGS OF THE 17TH INNOVATIONS IN SOFTWARE ENGINEERING CONFERENCE, ISEC 2024, 2024,
  • [38] A comprehensive study of automatic program repair on the QuixBugs benchmark
    Ye, He
    Martinez, Matias
    Durieux, Thomas
    Monperrus, Martin
    JOURNAL OF SYSTEMS AND SOFTWARE, 2021, 171
  • [39] A Comprehensive Study of Automatic Program Repair on the QuixBugs Benchmark
    Ye, He
    Martinez, Matias
    Durieux, Thomas
    Monperrus, Martin
    2019 IEEE 1ST INTERNATIONAL WORKSHOP ON INTELLIGENT BUG FIXING (IBF '19), 2019, : 1 - 10
  • [40] Attention Please: Consider Mockito when Evaluating Newly Proposed Automated Program Repair Techniques
    Wang, Shangwen
    Wen, Ming
    Mao, Xiaoguang
    Yang, Deheng
    PROCEEDINGS OF EASE 2019 - EVALUATION AND ASSESSMENT IN SOFTWARE ENGINEERING, 2019, : 260 - 266