A Lightweight Cross-Version Binary Code Similarity Detection Based on Similarity and Correlation Coefficient Features

Cited by: 8
Authors
Guo, Hui [1]
Huang, Shuguang [1]
Huang, Cheng [2]
Zhang, Min [1]
Pan, Zulie [1]
Shi, Fan [1]
Huang, Hui [1]
Hu, Donghui [3]
Wang, Xiaoping [1]
Affiliations
[1] Natl Univ Def Technol, Coll Elect Engn, Hefei 230011, Peoples R China
[2] Sichuan Univ, Coll Cybersecur, Chengdu 610065, Peoples R China
[3] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei 230009, Peoples R China
Source
IEEE ACCESS | 2020 / Volume 8
Keywords
Binary code similarity detection; cross-version binary; malware detection; similarity coefficient; correlation coefficient
DOI
10.1109/ACCESS.2020.3004813
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Binary code similarity detection (BCSD) is applied in many fields, such as malware detection, plagiarism detection, and vulnerability search. Existing solutions to the BCSD problem usually either compare specific features derived from the control flow graphs (CFGs) of functions in the binaries, or compute embedding vectors of binary functions and solve the problem with deep learning algorithms. In this paper, from a different research perspective, we propose a new, lightweight method that solves the cross-version BCSD problem based on multiple features. It transforms binary functions into vectors and signals and computes similarity coefficient and correlation coefficient values between them. Without relying on the CFGs of functions, deep learning algorithms, or other related attributes, our method works directly on the raw bytes of each binary, and it can serve as an alternative method for coping with the various complex situations that arise in real-world environments. We implement the method and evaluate it on a custom dataset of about 423,282 samples. The results show that the method performs well on cross-version BCSD: its recall reaches 96.63%, which is almost the same as that of the state-of-the-art static solution.
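The abstract does not say which similarity or correlation coefficients the paper uses, nor how functions are turned into vectors and signals. As a rough illustration of the general idea only, the sketch below assumes a Jaccard coefficient over byte 2-grams and a Pearson correlation over 256-bin byte histograms. The function names, the feature choices, and the two sample byte strings are hypothetical and are not taken from the paper.

```python
from math import sqrt


def byte_vector(code: bytes) -> list:
    """Assumed vector view: a 256-bin histogram of raw byte values."""
    hist = [0] * 256
    for b in code:
        hist[b] += 1
    return hist


def jaccard_coefficient(a: bytes, b: bytes, n: int = 2) -> float:
    """Jaccard similarity between the sets of byte n-grams of a and b."""
    grams_a = {a[i:i + n] for i in range(len(a) - n + 1)}
    grams_b = {b[i:i + n] for i in range(len(b) - n + 1)}
    union = grams_a | grams_b
    if not union:
        return 1.0  # both inputs too short to yield any n-gram
    return len(grams_a & grams_b) / len(union)


def pearson_correlation(x, y) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    cov = sum(p * q for p, q in zip(dx, dy))
    var_x = sum(p * p for p in dx)
    var_y = sum(q * q for q in dy)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant sequence has no defined correlation
    return cov / sqrt(var_x * var_y)


# Two hypothetical versions of one small function, differing in a single byte.
old_version = bytes.fromhex("554889e5897dfc8b45fc83c0015dc3")
new_version = bytes.fromhex("554889e5897dfc8b45fc83c0025dc3")

similarity = jaccard_coefficient(old_version, new_version)
correlation = pearson_correlation(byte_vector(old_version),
                                  byte_vector(new_version))
print(f"similarity coefficient:  {similarity:.4f}")
print(f"correlation coefficient: {correlation:.4f}")
```

Both scores come straight from the raw bytes, with no disassembly, CFG recovery, or model training, which is the sense in which such an approach can be called lightweight. How the two scores would be combined into a match decision is not described in the abstract and is left out here.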
Pages: 120501-120512
Number of pages: 12
Related papers
50 in total
  • [1] αDiff: Cross-Version Binary Code Similarity Detection with DNN
    Liu, Bingchang
    Huo, Wei
    Zhang, Chao
    Li, Wenchao
    Li, Feng
    Piao, Aihua
    Zou, Wei
    PROCEEDINGS OF THE 2018 33RD IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE '18), 2018, : 667 - 678
  • [2] SimCGE: Simple Contrastive Learning of Graph Embeddings for Cross-Version Binary Code Similarity Detection
    Xia, Fengliang
    Wu, Guixing
    Zhao, Guochao
    Li, Xiangyu
    INFORMATION AND COMMUNICATIONS SECURITY, ICICS 2022, 2022, 13407 : 458 - 471
  • [3] Identifying Cross-Version Function Similarity Using Contextual Features
    Black, Paul
    Gondal, Iqbal
    Vamplew, Peter
    Lakhotia, Arun
    2020 IEEE 19TH INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS (TRUSTCOM 2020), 2020, : 810 - 818
  • [4] Binary Code Similarity Detection
    Liu, Zian
    2021 36TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING ASE 2021, 2021, : 1056 - 1060
  • [5] Practical Binary Code Similarity Detection with BERT-based Transferable Similarity Learning
    Ahn, Sunwoo
    Ahn, Seonggwan
    Koo, Hyungjoon
    Paek, Yunheung
    PROCEEDINGS OF THE 38TH ANNUAL COMPUTER SECURITY APPLICATIONS CONFERENCE, ACSAC 2022, 2022, : 361 - 374
  • [6] Cross-platform binary code similarity detection based on NMT and graph embedding
    Zhu, Xiaodong
    Jiang, Liehui
    Chen, Zeng
    MATHEMATICAL BIOSCIENCES AND ENGINEERING, 2021, 18 (04) : 4528 - 4551
  • [7] CBSDI: Cross-Architecture Binary Code Similarity Detection based on Index Table
    Deng, Longmin
    Zhao, Dongdong
    Zhou, Junwei
    Xia, Zhe
    Xiang, Jianwen
    2022 IEEE 22ND INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY, QRS, 2022, : 527 - 536
  • [8] Binary Code Similarity Detection: State and Future
    Li, Zhenshan
    Liu, Hao
    Shan, Ruijie
    Sun, Yanbin
    Jiang, Yu
    Hu, Ning
    2023 IEEE 12TH INTERNATIONAL CONFERENCE ON CLOUD NETWORKING, CLOUDNET, 2023, : 408 - 412
  • [9] A Survey of Binary Code Similarity Detection Techniques
    Ruan, Liting
    Xu, Qizhen
    Zhu, Shunzhi
    Huang, Xujing
    Lin, Xinyang
    ELECTRONICS, 2024, 13 (09)
  • [10] IFAttn: Binary code similarity analysis based on interpretable features with attention
    Jiang, Shuai
    Fu, Cai
    Qian, Yekui
    He, Shuai
    Lv, Jianqiang
    Han, Lansheng
    COMPUTERS & SECURITY, 2022, 120