DeepVulSeeker: A novel vulnerability identification framework via code graph structure and pre-training mechanism

Cited by: 4
Authors
Wang, Jin [1]
Xiao, Hui [1]
Zhong, Shuwen [1]
Xiao, Yinhao [1]
Affiliations
[1] Guangdong Univ Finance & Econ, Sch Informat Sci, Guangzhou 510320, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Vulnerability identification; Software security; Neural network; Pre-training; Vulnerability pattern; Code feature; Clone detection
DOI
10.1016/j.future.2023.05.016
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Software vulnerabilities can cause severe harm to a computing system: they can lead to system crashes, privacy leakage, or even physical damage. Identifying vulnerabilities correctly and promptly in large bodies of software code is an essential prerequisite to patching them. Unfortunately, current vulnerability identification methods, both classic and deep-learning-based, have several critical drawbacks that leave them unable to meet the present-day demands of the software industry. To overcome these drawbacks, we propose DeepVulSeeker, a novel, fully automated vulnerability identification framework that leverages both code graph structures and semantic features with the help of the recently advanced Graph Representation Self-Attention and pre-training mechanisms. Our experiments show that DeepVulSeeker not only reaches an accuracy as high as 0.99 on traditional CWE datasets, but also outperforms all other existing methods on two highly complicated datasets. We also tested DeepVulSeeker in three case studies and found that it is able to understand the implications of the vulnerabilities. We have fully implemented DeepVulSeeker and open-sourced it for future follow-up research. © 2023 Elsevier B.V. All rights reserved.
Pages: 15-26
Number of pages: 12