How are functionally similar code clones syntactically different? An empirical study and a benchmark

Cited by: 12
Authors
Wagner, Stefan [1]
Abdulkhaleq, Asim [1]
Bogicevic, Ivan [1]
Ostberg, Jan-Peter [1]
Ramadani, Jasmin [1]
Affiliations
[1] Univ Stuttgart, Inst Software Technol, Stuttgart, Germany
Source
Keywords
Code clone; Functionally similar clone; Empirical study; Benchmark;
DOI
10.7717/peerj-cs.49
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Background. Today, redundancy in source code, so-called "clones" caused by copy&paste, can be found reliably using clone detection tools. Redundancy can also arise independently, however, not caused by copy&paste. At present, it is not clear how only functionally similar clones (FSC) differ from clones created by copy&paste. Our aim is to understand and categorise the syntactical differences in FSCs that distinguish them from copy&paste clones in a way that helps clone detection research. Methods. We conducted an experiment using known functionally similar programs in Java and C from coding contests. We analysed syntactic similarity with traditional detection tools and explored whether concolic clone detection can go beyond syntax. We ran all tools on 2,800 programs and manually categorised the differences in a random sample of 70 program pairs. Results. We found no FSCs where complete files were syntactically similar. We could detect a syntactic similarity in a part of the files in <16% of the program pairs. Concolic detection found 1 of the FSCs. The differences between program pairs were in the categories algorithm, data structure, OO design, I/O and libraries. We selected 58 pairs for an openly accessible benchmark representing these categories. Discussion. The majority of differences between functionally similar clones are beyond the capabilities of current clone detection approaches. Yet, our benchmark can help to drive further clone detection research.
Pages: 26