Similarity-Based Source Code Vulnerability Detection Leveraging Transformer Architecture: Harnessing Cross-Attention for Hierarchical Analysis

Authors
Han, Sungmin [1]
Kim, Miju [1]
Kang, Jaesik [2]
Kim, Kwangsoo [2]
Lee, Seungwoon [2]
Lee, Sangkyun [1]
Affiliations
[1] Korea Univ, Sch Cybersecur, Seoul 02841, South Korea
[2] LIG Nex1, Cyber Warfare Res & Dev Lab, Seongnam Si 13488, South Korea
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Source coding; Codes; Transformers; Contrastive learning; Computer architecture; Data models; Deep learning; Computational modeling; Security; Software reliability; Code similarity; contrastive learning; cross-attention; source code vulnerability detection; transformer;
DOI
10.1109/ACCESS.2024.3474857
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The growing complexity and volume of modern software have led to an increase in source code vulnerabilities, posing significant security risks. In response, deep learning-based automated source code vulnerability detection methods, particularly those utilizing source code similarity analysis, have recently emerged as promising solutions. However, existing similarity-based source code vulnerability detection methods frequently fail to fully utilize information from the hierarchical structure of source code and are often computationally expensive, limiting their practicality in real-world scenarios. In this paper, we introduce XTransformer, a novel deep learning-based source code vulnerability detector tailored for comparing target source code against archived vulnerable codes across various levels of the source code's hierarchical structure by leveraging extra cross-attention imposed on the transformer architecture. Additionally, we propose a specialized training strategy based on supervised contrastive learning to improve XTransformer's ability to effectively learn and differentiate between vulnerable and non-vulnerable source codes. Comprehensive experiments demonstrate that XTransformer outperforms current state-of-the-art methods across different datasets and code lengths while significantly reducing the inference time compared to other similarity-based methods that utilize hierarchical information from source code.
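The two mechanisms named in the abstract, cross-attention between target code and archived vulnerable code, and supervised contrastive training, can be illustrated with a minimal sketch. This is not the XTransformer implementation: the abstract gives no architectural details, so the code below uses generic scaled dot-product cross-attention and a generic supervised contrastive loss. All function names, the toy vectors, and the temperature value are assumptions made for illustration.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (assumed form).

    queries: vectors for units of the target code.
    keys/values: vectors for units of an archived vulnerable code.
    Returns the attended outputs and the attention weights.
    """
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(wj * v[c] for wj, v in zip(w, values))
                        for c in range(len(values[0]))])
    return outputs, weights

def supcon_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (assumed form): samples sharing
    a label are pulled together, all other samples are pushed apart."""
    def normalise(v):
        n = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / n for x in v]

    z = [normalise(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        sims = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                for j in range(n) if j != i}
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

# Toy usage: one target unit attends over two archived units.
out, w = cross_attention([[1.0, 0.0]],
                         [[1.0, 0.0], [0.0, 1.0]],
                         [[10.0, 0.0], [0.0, 10.0]])

# Toy embeddings: label 1 = vulnerable, label 0 = non-vulnerable.
emb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_separated = supcon_loss(emb, [1, 1, 0, 0])
loss_mixed = supcon_loss(emb, [1, 0, 1, 0])
```

In this toy setup the target unit places more attention weight on the archived unit it resembles, and the contrastive loss is lower when embeddings with the same label already lie close together, which is the behaviour such a training objective rewards.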
Pages: 150295-150307
Number of pages: 13
Related papers
13 in total
  • [1] CRABS-former: CRoss-Architecture Binary Code Similarity Detection based on Transformer
    Feng, Yuhong
    Li, Haoran
    Cao, Yixuan
    Wang, Yufeng
    Feng, Haiyue
    PROCEEDINGS OF THE 15TH ASIA-PACIFIC SYMPOSIUM ON INTERNETWARE, INTERNETWARE 2024, 2024, : 11 - 20
  • [2] A cosine similarity-based labeling technique for vulnerability type detection using source codes
    Ozturk, M. Maruf
    COMPUTERS & SECURITY, 2024, 146
  • [3] VDHGT: A Source Code Vulnerability Detection Method Based on Heterogeneous Graph Transformer
    Yang, Hongyu
    Yang, Haiyun
    Zhang, Liang
    CYBERSPACE SAFETY AND SECURITY, CSS 2022, 2022, 13547 : 217 - 224
  • [4] VulPecker: An Automated Vulnerability Detection System Based on Code Similarity Analysis
    Li, Zhen
    Zou, Deqing
    Xu, Shouhuai
    Jin, Hai
    Qi, Hanchao
    Hu, Jie
    32ND ANNUAL COMPUTER SECURITY APPLICATIONS CONFERENCE (ACSAC 2016), 2016, : 201 - 213
  • [5] DVul-WLG: Graph Embedding Network Based on Code Similarity for Cross-Architecture Firmware Vulnerability Detection
    Sun, Hao
    Tong, Yanjun
    Zhao, Jing
    Gu, Zhaoquan
    INFORMATION SECURITY (ISC 2021), 2021, 13118 : 320 - 337
  • [6] Flowchart-Based Cross-Language Source Code Similarity Detection
    Zhang, Feng
    Li, Guofan
    Liu, Cong
    Song, Qian
    SCIENTIFIC PROGRAMMING, 2020, 2020
  • [7] CBSDI: Cross-Architecture Binary Code Similarity Detection based on Index Table
    Deng, Longmin
    Zhao, Dongdong
    Zhou, Junwei
    Xia, Zhe
    Xiang, Jianwen
    2022 IEEE 22ND INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY, QRS, 2022, : 527 - 536
  • [8] A vulnerability detection algorithm based on residual graph attention networks for source code imbalance (RGAN)
    Tang, Mingwei
    Tang, Wei
    Gui, Qingchi
    Hu, Jie
    Zhao, Mingfeng
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 238
  • [9] MCANet: Hierarchical cross-fusion lightweight transformer based on multi-ConvHead attention for object detection
    Zhao, Zuopeng
    Hao, Kai
    Liu, Xiaofeng
    Zheng, Tianci
    Xu, Junjie
    Cui, Shuya
    He, Chen
    Zhou, Jie
    Zhao, Guangming
    IMAGE AND VISION COMPUTING, 2023, 136
  • [10] Optir-SBERT: Cross-Architecture Binary Code Similarity Detection Based on Optimized LLVM IR
    Yan, Yintong
    Yu, Lu
    Wang, Taiyan
    Li, Yuwei
    Pan, Zulie
    DIGITAL FORENSICS AND CYBER CRIME, PT 2, ICDF2C 2023, 2024, 571 : 95 - 113