Source Code Clone Detection Using Unsupervised Similarity Measures

Cited by: 0
Authors
Martinez-Gil, Jorge [1]
Affiliations
[1] Software Competence Ctr Hagenberg GmbH, Softwarepk 32a, A-4232 Hagenberg, Austria
Keywords
Software Engineering; Clone Detection; Similarity Measures; Code Similarity; METRICS; GRAPH;
DOI
10.1007/978-3-031-56281-5_2
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Assessing similarity in source code has gained significant attention in recent years due to its importance in software engineering tasks such as clone detection, code search, and code recommendation. This work presents a comparative analysis of unsupervised similarity measures for source code clone detection. The goal is to give an overview of the current state-of-the-art techniques and their strengths and weaknesses. To that end, we compile the existing unsupervised strategies and evaluate their performance on a benchmark dataset to guide software engineers in selecting appropriate methods for their specific use cases. The source code of this study is available at https://github.com/jorge-martinez-gil/codesim
Pages: 21-37
Page count: 17
Related papers
50 in total
  • [31] Semantic Similarity Search for Source Code Plagiarism Detection: An Exploratory Study
    Ebrahim, Fahad
    Joy, Mike
    [J]. PROCEEDINGS OF THE 2024 CONFERENCE INNOVATION AND TECHNOLOGY IN COMPUTER SCIENCE EDUCATION, VOL 1, ITICSE 2024, 2024, : 360 - 366
  • [32] Exploring the Similarity/Dissimilarity Measures for Unsupervised IDS
    Murty, P. Sita Rama
    Kumar, R. Kiran
    Sailaja, M.
    [J]. PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON DATA MINING AND ADVANCED COMPUTING (SAPIENCE), 2016, : 220 - 224
  • [33] Calibration of source-code similarity detection tools for objective comparisons
    Novak, M.
    Kermek, D.
    Joy, M.
    [J]. 2018 41ST INTERNATIONAL CONVENTION ON INFORMATION AND COMMUNICATION TECHNOLOGY, ELECTRONICS AND MICROELECTRONICS (MIPRO), 2018, : 794 - 799
  • [34] Identifying Source Code Reuse across Repositories using LCS-based Source Code Similarity
    Kawamitsu, Naohiro
    Ishio, Takashi
    Kanda, Tetsuya
    Kula, Raula Gaikovina
    De Roover, Coen
    Inoue, Katsuro
    [J]. 2014 14TH IEEE INTERNATIONAL WORKING CONFERENCE ON SOURCE CODE ANALYSIS AND MANIPULATION (SCAM 2014), 2014, : 305 - 314
  • [35] Academic Source Code Plagiarism Detection by Measuring Program Behavioral Similarity
    Cheers, Hayden
    Lin, Yuqing
    Smith, Shamus P.
    [J]. IEEE ACCESS, 2021, 9 : 50391 - 50412
  • [36] Clustering Source Code Elements by Semantic Similarity Using Wikipedia
    Schindler, Mirco
    Fox, Oliver
    Rausch, Andreas
    [J]. 2015 IEEE/ACM FOURTH INTERNATIONAL WORKSHOP ON REALIZING ARTIFICIAL INTELLIGENCE SYNERGIES IN SOFTWARE ENGINEERING (RAISE 2015), 2015, : 13 - 18
  • [37] VGRAPH: A Robust Vulnerable Code Clone Detection System Using Code Property Triplets
    Bowman, Benjamin
    Huang, H. Howie
    [J]. 2020 5TH IEEE EUROPEAN SYMPOSIUM ON SECURITY AND PRIVACY (EUROS&P 2020), 2020, : 53 - 69
  • [38] Scalable Source Code Plagiarism Detection Using Source Code Vectors Clustering
    Duracik, Michal
    Krsak, Emil
    Hrkut, Patrik
    [J]. PROCEEDINGS OF 2018 IEEE 9TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS), 2018, : 499 - 502
  • [39] Code Similarity Detection using AST and Textual Information
    Wen W.
    Xue X.
    Li Y.
    Gu P.
    Xu J.
    [J]. International Journal of Performability Engineering, 2019, 15 (10) : 2683 - 2691
  • [40] Source-code Similarity Detection and Detection Tools Used in Academia: A Systematic Review
    Novak, Matija
    Joy, Mike
    Kermek, Dragutin
    [J]. ACM TRANSACTIONS ON COMPUTING EDUCATION, 2019, 19 (03)