Detecting Code Clones in Binary Executables

Cited by: 0
Authors
Saebjornsen, Andreas [1]
Willcock, Jeremiah
Panas, Thomas
Quinlan, Daniel
Su, Zhendong [1]
Affiliations
[1] Univ Calif Davis, Davis, CA 95616 USA
Keywords
software tools; clone detection; binary analysis; algorithms
DOI
Not available
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Large software projects contain significant code duplication, mainly due to copying and pasting code. Many techniques have been developed to identify duplicated code to enable applications such as refactoring, detecting bugs, and protecting intellectual property. Because source code is often unavailable, especially for third-party software, finding duplicated code in binaries becomes particularly important. However, existing techniques operate primarily on source code, and no effective tool exists for binaries. In this paper, we describe the first practical clone detection algorithm for binary executables. Our algorithm extends an existing tree similarity framework based on clustering of characteristic vectors of labeled trees with novel techniques to normalize assembly instructions and to accurately and compactly model their structural information. We have implemented our technique and evaluated it on Windows XP system binaries totaling over 50 million assembly instructions. Results show that it is both scalable and precise: it analyzed Windows XP system binaries in a few hours and produced few false positives. We believe our technique is a practical, enabling technology for many applications dealing with binary code.
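The pipeline the abstract outlines (normalize instructions, summarize code regions as characteristic vectors, group similar vectors) can be illustrated with a minimal sketch. This is not the authors' implementation: the register list, the operand classes (`REG`, `MEM`, `VAL`), the sliding window over a flat instruction list, and all function names are assumptions made for illustration, and exact-match grouping stands in for the clustering step the abstract mentions.

```python
import re
from collections import Counter, defaultdict

# Assumed (hypothetical) register set; a real tool would cover the full ISA.
REGISTERS = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}


def normalize_operand(op):
    """Map a concrete operand to an abstract class (illustrative scheme)."""
    op = op.strip().lower()
    if op in REGISTERS:
        return "REG"
    if "[" in op:
        return "MEM"
    if re.fullmatch(r"(0x[0-9a-f]+|\d+)", op):
        return "VAL"
    return "OTHER"


def normalize(instruction):
    """Return (mnemonic, operand classes...) for one assembly line."""
    mnemonic, _, rest = instruction.strip().partition(" ")
    operands = [normalize_operand(o) for o in rest.split(",")] if rest.strip() else []
    return (mnemonic.lower(), *operands)


def characteristic_vector(window):
    """Count mnemonics and operand classes in a window of instructions."""
    counts = Counter()
    for instr in window:
        norm = normalize(instr)
        counts["op:" + norm[0]] += 1
        for kind in norm[1:]:
            counts["operand:" + kind] += 1
    # A frozenset of (feature, count) pairs is hashable, so it can be a dict key.
    return frozenset(counts.items())


def find_clone_candidates(instructions, window_size=3):
    """Group window start offsets whose characteristic vectors are identical.

    Exact matching is a simplification; it replaces similarity clustering.
    """
    groups = defaultdict(list)
    for start in range(len(instructions) - window_size + 1):
        vec = characteristic_vector(instructions[start:start + window_size])
        groups[vec].append(start)
    return [starts for starts in groups.values() if len(starts) > 1]


code = [
    "mov eax, [ebp+8]",
    "add eax, 4",
    "mov [ebp-4], eax",
    "nop",
    "mov ebx, [ebp+12]",
    "add ebx, 16",
    "mov [ebp-8], ebx",
]
candidates = find_clone_candidates(code)
```

In this toy input, the windows starting at offsets 0 and 4 differ only in register names and constants, so they normalize to the same vector and are reported together. Because the vector is an unordered count of features, windows with the same instruction mix in a different order also match, which is one reason a real detector needs richer structural features and a verification step.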
Pages: 117 - 127
Page count: 11
Related papers
50 in total
  • [21] Combining Holistic Source Code Representation with Siamese Neural Networks for Detecting Code Clones
    Patel, Smit
    Sinha, Roopak
    TESTING SOFTWARE AND SYSTEMS, ICTSS 2021, 2022, 13045 : 148 - 159
  • [22] Method for Detecting Unknown Malicious Executables
    Rozenberg, Boris
    Gudes, Ehud
    Elovici, Yuval
    Fledel, Yuval
    RECENT ADVANCES IN INTRUSION DETECTION, PROCEEDINGS, 2009, 5758 : 376 - 377
  • [23] Method of Searching for Clones of the Program Code in Binary Executive Files
    E. V. Zavadskii
    A. V. Bulat
    N. A. Gribkov
    Automatic Control and Computer Sciences, 2024, 58 (8) : 1263 - 1270
  • [24] Interprocedural static slicing of binary executables
    Kiss, A
    Jász, J
    Lehotai, G
    Gyimóthy, T
    THIRD IEEE INTERNATIONAL WORKSHOP ON SOURCE CODE ANALYSIS AND MANIPULATION - PROCEEDINGS, 2003, : 118 - 127
  • [25] Tracelet-Based Code Search in Executables
    David, Yaniv
    Yahav, Eran
    ACM SIGPLAN NOTICES, 2014, 49 (06) : 349 - 360
  • [26] Renovo: A Hidden Code Extractor for Packed Executables
    Kang, Min Gyung
    Poosankam, Pongsin
    Yin, Heng
    WORM'07: PROCEEDINGS OF THE 2007 ACM WORKSHOP ON RECURRING MALCODE, 2007, : 46 - 53
  • [27] Statically detecting use after free on binary code
    Feist, Josselin
    Mounier, Laurent
    Potet, Marie-Laure
    JOURNAL OF COMPUTER VIROLOGY AND HACKING TECHNIQUES, 2014, 10 (03) : 211 - 217
  • [28] A Novel Approach for Detecting Type-IV Clones in Test Code
    van Bladel, Brent
    Demeyer, Serge
    2019 IEEE 13TH INTERNATIONAL WORKSHOP ON SOFTWARE CLONES (IWSC '19), 2019, : 8 - 12
  • [29] Detecting Java Code Clones Based on Bytecode Sequence Alignment
    Yu, Dongjin
    Yang, Jiazha
    Chen, Xin
    Chen, Jie
    IEEE ACCESS, 2019, 7 : 22421 - 22433
  • [30] A static API birthmark for Windows binary executables
    Choi, Seokwoo
    Park, Heewan
    Lim, Hyun-il
    Han, Taisook
    JOURNAL OF SYSTEMS AND SOFTWARE, 2009, 82 (05) : 862 - 873