GraphBinMatch: Graph-based Similarity Learning for Cross-Language Binary and Source Code Matching

Cited by: 0
Authors
TehraniJamsaz, Ali [1]
Chen, Hanze [1]
Jannesari, Ali [1]
Affiliations
[1] Iowa State Univ, Ames, IA 50011 USA
Funding
U.S. National Science Foundation
Keywords
cross-language; code similarity; binary-source matching;
DOI
10.1109/IPDPSW63119.2024.00103
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
Matching binary to source code and vice versa has various applications in different fields, such as computer security, software engineering, and reverse engineering. Even though there exist methods that try to match source code with binary code to accelerate the reverse engineering process, most of them are designed to focus on one programming language. However, in real life, programs are developed using different programming languages depending on their requirements. Thus, cross-language binary-to-source code matching has recently gained more attention. Nonetheless, the existing approaches still struggle to make precise predictions due to the inherent difficulties that arise when binary code and source code must be matched across programming languages. In this paper, we address the problem of cross-language binary-to-source code matching. We propose GraphBinMatch, an approach based on a graph neural network that learns the similarity between binary and source code. We evaluate GraphBinMatch on several tasks, such as cross-language binary-to-source code matching and cross-language source-to-source matching. We also evaluate the performance of our approach on single-language binary-to-source code matching. Experimental results show that GraphBinMatch significantly outperforms the state of the art, with improvements of up to 15% in F1 score.
Pages: 506-515
Page count: 10