Benchmarks for Software Clone Detection: A Ten-Year Retrospective

Authors
Roy, Chanchal K. [1]
Cordy, James R. [2]
Affiliations
[1] Univ Saskatchewan, Dept Comp Sci, Saskatoon, SK, Canada
[2] Queens Univ, Sch Comp, Kingston, ON, Canada
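Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Code clones
DOI
Not available
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract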
A great many methods and tools have been proposed for software clone detection. While some work has been done on assessing and comparing the performance of these tools, very little empirical evaluation has been done. In particular, accuracy measures such as precision and recall have only been roughly estimated, due both to problems in creating a validated clone benchmark against which tools can be compared, and to the manual effort required to hand-check large numbers of candidate clones. To address this issue, we have spent the last ten years working towards building cloning benchmarks for objectively evaluating clone detection tools. Beginning with our WCRE 2008 paper, where we conducted a modestly large empirical study with the NiCad clone detection tool, we have extended our work to include several languages, much larger datasets, and model clones in languages such as Simulink. From a modest set of 15 C and Java systems comprising a total of 7 million lines in 2008, our work has progressed to a benchmark called BigCloneBench, with eight million manually validated clone pairs in a large inter-project source dataset of more than 25,000 projects and 365 million lines of code. In this paper, we present a history and overview of software clone detection benchmarks, and review the steps that we and others have taken to reach this stage. We outline a future for clone detection benchmarks and hope to encourage researchers both to use existing benchmarks and to contribute to building the benchmarks of the future.
Pages: 26-37
Number of pages: 12