Benchmarks for Software Clone Detection: A Ten-Year Retrospective

Authors
Roy, Chanchal K. [1]
Cordy, James R. [2]
Affiliations
[1] Univ Saskatchewan, Dept Comp Sci, Saskatoon, SK, Canada
[2] Queens Univ, Sch Comp, Kingston, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
CODE CLONES;
DOI
Not available
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
There have been a great many methods and tools proposed for software clone detection. While some work has been done on assessing and comparing performance of these tools, very little empirical evaluation has been done. In particular, accuracy measures such as precision and recall have only been roughly estimated, due both to problems in creating a validated clone benchmark against which tools can be compared, and to the manual effort required to hand check large numbers of candidate clones. In order to cope with this issue, over the last 10 years we have been working towards building cloning benchmarks for objectively evaluating clone detection tools. Beginning with our WCRE 2008 paper, where we conducted a modestly large empirical study with the NiCad clone detection tool, over the past ten years we have extended and grown our work to include several languages, much larger datasets, and model clones in languages such as Simulink. From a modest set of 15 C and Java systems comprising a total of 7 million lines in 2008, our work has progressed to a benchmark called BigCloneBench with eight million manually validated clone pairs in a large inter-project source dataset of more than 25,000 projects and 365 million lines of code. In this paper, we present a history and overview of software clone detection benchmarks, and review the steps of ourselves and others to come to this stage. We outline a future for clone detection benchmarks and hope to encourage researchers to both use existing benchmarks and to contribute to building the benchmarks of the future.
Pages: 26-37
Page count: 12