GraphBinMatch: Graph-based Similarity Learning for Cross-Language Binary and Source Code Matching

Cited by: 0
Authors
TehraniJamsaz, Ali [1]
Chen, Hanze [1]
Jannesari, Ali [1]
Affiliation
[1] Iowa State Univ, Ames, IA 50011 USA
Funding
U.S. National Science Foundation
Keywords
cross-language; code similarity; binary-source matching
DOI
10.1109/IPDPSW63119.2024.00103
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Matching binary to source code and vice versa has various applications in different fields, such as computer security, software engineering, and reverse engineering. Even though there exist methods that try to match source code with binary code to accelerate the reverse engineering process, most of them are designed to focus on one programming language. However, in real life, programs are developed using different programming languages depending on their requirements. Thus, cross-language binary-to-source code matching has recently gained more attention. Nonetheless, the existing approaches still struggle to make precise predictions because of the inherent difficulty of matching binary code and source code across programming languages. In this paper, we address the problem of cross-language binary-source code matching. We propose GraphBinMatch, an approach based on a graph neural network that learns the similarity between binary and source code. We evaluate GraphBinMatch on several tasks, such as cross-language binary-to-source code matching and cross-language source-to-source matching. We also evaluate the performance of our approach on single-language binary-to-source code matching. Experimental results show that GraphBinMatch significantly outperforms the state of the art, with improvements as high as 15% in F1 score.
Pages: 506-515
Page count: 10