Detecting Code Clones in Binary Executables

Citations: 0
Authors
Saebjornsen, Andreas [1]
Willcock, Jeremiah
Panas, Thomas
Quinlan, Daniel
Su, Zhendong [1]
Affiliations
[1] Univ Calif Davis, Davis, CA 95616 USA
Keywords
software tools; clone detection; binary analysis; algorithms
DOI
Not available
Chinese Library Classification
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Large software projects contain significant code duplication, mainly due to copying and pasting code. Many techniques have been developed to identify duplicated code to enable applications such as refactoring, detecting bugs, and protecting intellectual property. Because source code is often unavailable, especially for third-party software, finding duplicated code in binaries becomes particularly important. However, existing techniques operate primarily on source code, and no effective tool exists for binaries. In this paper, we describe the first practical clone detection algorithm for binary executables. Our algorithm extends an existing tree similarity framework based on clustering of characteristic vectors of labeled trees with novel techniques to normalize assembly instructions and to accurately and compactly model their structural information. We have implemented our technique and evaluated it on Windows XP system binaries totaling over 50 million assembly instructions. Results show that it is both scalable and precise: it analyzed Windows XP system binaries in a few hours and produced few false positives. We believe our technique is a practical, enabling technology for many applications dealing with binary code.
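The abstract names two ingredients: normalizing assembly instructions and comparing characteristic vectors. A minimal Python sketch of that idea follows. It is an illustration under stated assumptions, not the authors' implementation: the operand classes (`REG`, `MEM`, `VAL`), the small register set, the flat count-based vector, and the plain L1 distance (standing in for the clustering step) are all hypothetical choices made here for brevity.

```python
import re
from collections import Counter

# Assumption: a small sample of x86 register names, enough for the demo below.
REGISTERS = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}


def normalize_operand(operand):
    """Map a concrete operand to an abstract class (hypothetical scheme)."""
    operand = operand.strip().lower()
    if "[" in operand:
        return "MEM"  # any memory reference
    if operand in REGISTERS:
        return "REG"  # any general-purpose register
    if re.fullmatch(r"-?(0x[0-9a-f]+|\d+)", operand):
        return "VAL"  # any immediate constant
    return "OTHER"


def normalize(instruction):
    """Return (mnemonic, operand classes...) for one Intel-syntax instruction."""
    mnemonic, _, rest = instruction.strip().partition(" ")
    operands = [normalize_operand(o) for o in rest.split(",") if o.strip()]
    return (mnemonic.lower(), *operands)


def characteristic_vector(instructions):
    """Count mnemonics and operand classes over a region of instructions."""
    counts = Counter()
    for instruction in instructions:
        mnemonic, *operands = normalize(instruction)
        counts["op:" + mnemonic] += 1
        for kind in operands:
            counts["operand:" + kind] += 1
    return counts


def l1_distance(a, b):
    """Sum of absolute count differences; 0 means identical vectors."""
    return sum(abs(a[key] - b[key]) for key in a.keys() | b.keys())


# Two regions that differ only in register names and constants.
region_a = ["mov eax, [ebp+8]", "add eax, 4", "push eax"]
region_b = ["mov ecx, [esi+12]", "add ecx, 16", "push ecx"]
# An unrelated region.
region_c = ["xor eax, eax", "ret"]

vec_a = characteristic_vector(region_a)
vec_b = characteristic_vector(region_b)
vec_c = characteristic_vector(region_c)
```

Under this sketch, `region_a` and `region_b` normalize to the same vector, so their distance is zero, while `region_c` lies at a positive distance from both. A tool working at the scale the abstract reports would need a far more compact comparison strategy than pairwise distances.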
Pages: 117-127
Page count: 11