Gapped Code Clone Detection with Lightweight Source Code Analysis

Cited by: 0
Authors
Murakami, Hiroaki [1]
Hotta, Keisuke [1]
Higo, Yoshiki [1]
Igaki, Hiroshi [1]
Kusumoto, Shinji [1]
Affiliation
[1] Osaka Univ, Grad Sch Informat Sci & Technol, Suita, Osaka 5650871, Japan
Keywords
Code Clone; Program Analysis; Software Maintenance; Tool Comparison; SYSTEM;
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
A variety of methods for detecting code clones have been proposed. To detect gapped code clones, AST-based techniques, PDG-based techniques, metric-based techniques, and text-based techniques using the LCS algorithm have been proposed. However, each of those techniques has limitations. For example, existing AST-based and PDG-based techniques incur the cost of transforming source files into intermediate representations such as ASTs or PDGs and then comparing them. Existing metric-based techniques and text-based techniques using the LCS algorithm cannot detect code clones when methods or blocks are only partially duplicated. This paper proposes a new method that resolves those limitations by detecting gapped code clones with the Smith-Waterman algorithm, which identifies similar alignments between two sequences even if they include some gaps. The authors implemented the proposed method as a software tool named CDSW and confirmed, through a quantitative evaluation with Bellon's benchmark, that the method resolves the limitations.
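To illustrate how the Smith-Waterman algorithm mentioned in the abstract tolerates gaps, the following is a minimal sketch and not CDSW's implementation. It assumes each statement has already been reduced to a comparable token (plain strings here), and the function name `smith_waterman` and the scoring values `match`, `mismatch`, and `gap` are illustrative assumptions rather than values taken from the paper.

```python
from typing import List, Optional, Tuple

Pair = Tuple[Optional[str], Optional[str]]


def smith_waterman(a: List[str], b: List[str],
                   match: int = 2, mismatch: int = -2,
                   gap: int = -1) -> Tuple[int, List[Pair]]:
    """Return the best local alignment score and the aligned pairs.

    The scoring parameters are hypothetical defaults chosen for
    illustration; they are not the values used by CDSW.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,                         # start a new local alignment
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # a[i-1] is a gap statement
                score[i][j - 1] + gap,      # b[j-1] is a gap statement
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    aligned: List[Pair] = []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            aligned.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            aligned.append((a[i - 1], None))  # gap on the b side
            i -= 1
        else:
            aligned.append((None, b[j - 1]))  # gap on the a side
            j -= 1
    aligned.reverse()
    return best, aligned


# Two statement sequences that differ by one inserted statement.
left = ["x = 0", "y = 1", "log(x)", "z = x + y", "return z"]
right = ["x = 0", "y = 1", "z = x + y", "return z"]
best_score, pairs = smith_waterman(left, right)
```

In this sketch the inserted `log(x)` statement is paired with `None`, so the two fragments are still reported as one aligned region containing a gap rather than as two separate shorter matches.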
Pages: 93-102
Page count: 10