DeKeDVer: A deep learning-based multi-type software vulnerability classification framework using vulnerability description and source code

Cited by: 4
Authors
Dong, Yukun [1 ]
Tang, Yeer [1 ]
Cheng, Xiaotong [1 ]
Yang, Yufei [1 ]
Affiliations
[1] China Univ Petr East China, Qingdao Inst Software, Coll Comp Sci & Technol, Qingdao, Peoples R China
Keywords
Multi-type vulnerability classification; Vulnerability description; Source code; Text Recurrent Convolutional Neural Network; Relational graph attention network;
DOI
10.1016/j.infsof.2023.107290
Chinese Library Classification number
TP [Automation technology, computer technology];
Subject classification code
0812 ;
Abstract
Context: Software vulnerabilities have long troubled software developers. Vulnerability classification is therefore crucial: it identifies the specific type of a vulnerability so that targeted repair can follow. Many papers have studied deep learning-based multi-type vulnerability classification; most rely on vulnerability descriptions and some on source code. However, vulnerability descriptions can sometimes mislead classification, and source code-based approaches have rarely been explored for multi-type vulnerability classification.
Objective: We design DeKeDVer (Vulnerability Descriptions and Key Domain based Vulnerability Classifier) with two objectives: (i) to extract more useful information from vulnerability descriptions; (ii) to make better use of the information that source code carries.
Method: We propose a multi-type vulnerability classifier that combines vulnerability descriptions and source code. The vulnerability descriptions and source code of each project are processed separately. For the vulnerability description of a sample, we preprocess it with a procedure designed from our observations of numerous descriptions, and then select text features. A Text Recurrent Convolutional Neural Network (TextRCNN) is then applied to learn the text information. For source code, we build its Code Property Graph (CPG) and extract key domains from it, which are then embedded. The resulting feature vectors are fed into a Relational Graph Attention Network (RGAT). The output vectors of TextRCNN and RGAT are combined to form the feature vector of the current sample. A Multi-Layer Perceptron (MLP) layer is then added to perform the classification.
Results: We conduct our experiments on C/C++ projects from NVD. Experimental results show that our approach achieves a weighted F1-measure of 84.49%, which indicates that it is more effective than existing tools.
Conclusion: Our work uses information from both vulnerability descriptions and source code to facilitate vulnerability classification, and achieves a higher weighted F1-measure than existing vulnerability classification tools.
Pages: 14
Related papers
50 in total
  • [1] A deep learning-based approach for software vulnerability detection using code metrics
    Subhan, Fazli
    Wu, Xiaoxue
    Bo, Lili
    Sun, Xiaobing
    Rahman, Muhammad
    [J]. IET SOFTWARE, 2022, 16 (05) : 516 - 526
  • [2] mVulSniffer: a multi-type source code vulnerability sniffer method
    Zhang, Xuejun
    Zhang, Fenghe
    Gai, Jiyang
    Du, Xiaogang
    Zhou, Wenjie
    Cai, Teli
    Zhao, Bo
    [J]. Tongxin Xuebao/Journal on Communications, 2023, 44 (09) : 149 - 160
  • [3] Literature survey of deep learning-based vulnerability analysis on source code
    Semasaba, Abubakar Omari Abdallah
    Zheng, Wei
    Wu, Xiaoxue
    Agyemang, Samuel Akwasi
    [J]. IET SOFTWARE, 2020, 14 (06) : 654 - 664
  • [4] An Empirical Study on Vulnerability Detection for Source Code Software based on Deep Learning
    Lin, Wei
    Cai, Saihua
    [J]. 2021 21ST INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY COMPANION (QRS-C 2021), 2021, : 1159 - 1160
  • [5] Research and Progress on Learning-Based Source Code Vulnerability Detection
    Su, Xiao-Hong
    Zheng, Wei-Ning
    Jiang, Yuan
    Wei, Hong-Wei
    Wan, Jia-Yuan
    Wei, Zi-Yue
    [J]. Jisuanji Xuebao/Chinese Journal of Computers, 2024, 47 (02) : 337 - 374
  • [6] An empirical evaluation of deep learning-based source code vulnerability detection: Representation versus models
    Semasaba, Abubakar Omari Abdallah
    Zheng, Wei
    Wu, Xiaoxue
    Agyemang, Samuel Akwasi
    Liu, Tao
    Ge, Yuan
    [J]. JOURNAL OF SOFTWARE-EVOLUTION AND PROCESS, 2023, 35 (11)
  • [7] Interpretation of Learning-Based Automatic Source Code Vulnerability Detection Model Using LIME
    Tang, Gaigai
    Zhang, Long
    Yang, Feng
    Meng, Lianxiao
    Cao, Weipeng
    Qiu, Meikang
    Ren, Shuangyin
    Yang, Lin
    Wang, Huiqiang
    [J]. KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT III, 2021, 12817 : 275 - 286
  • [8] Toward More Effective Deep Learning-based Automated Software Vulnerability Prediction, Classification, and Repair
    Fu, Michael
    [J]. 2023 IEEE/ACM 45TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: COMPANION PROCEEDINGS, ICSE-COMPANION, 2023, : 208 - 212
  • [9] Automated Vulnerability Detection in Source Code Using Deep Representation Learning
    Russell, Rebecca L.
    Kim, Louis
    Hamilton, Lei H.
    Lazovich, Tomo
    Harer, Jacob A.
    Ozdemir, Onur
    Ellingwood, Paul M.
    McConley, Marc W.
    [J]. 2018 17TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2018, : 757 - 762
  • [10] Learning to Predict Severity of Software Vulnerability Using Only Vulnerability Description
    Han, Zhuobing
    Li, Xiaohong
    Xing, Zhenchang
    Liu, Hongtao
    Feng, Zhiyong
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE AND EVOLUTION (ICSME), 2017, : 125 - 136