DeKeDVer: A deep learning-based multi-type software vulnerability classification framework using vulnerability description and source code

Cited by: 4

Authors:

Dong, Yukun [1]

Tang, Yeer [1]

Cheng, Xiaotong [1]

Yang, Yufei [1]

Affiliations:

[1] China Univ Petr East China, Qingdao Inst Software, Coll Comp Sci & Technol, Qingdao, Peoples R China

Source:

INFORMATION AND SOFTWARE TECHNOLOGY | 2023 / Vol. 163

Keywords:

Multi-type vulnerability classification; Vulnerability description; Source code; Text Recurrent Convolutional Neural Network; Relational graph attention network

DOI:

10.1016/j.infsof.2023.107290

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Context: Software vulnerabilities have long troubled software developers. Vulnerability classification is therefore crucial: it identifies the specific type of a vulnerability so that targeted repair can follow. Many papers have studied deep learning-based multi-type vulnerability classification; most rely on vulnerability descriptions and some on source code. However, vulnerability descriptions can sometimes mislead classification, and source code-based approaches have rarely been explored for multi-type vulnerability classification.
Objective: We design DeKeDVer (Vulnerability Descriptions and Key Domain based Vulnerability Classifier) with two objectives: (i) to extract more useful information from vulnerability descriptions; (ii) to make better use of the information that source code reflects.
Method: We propose a multi-type vulnerability classifier that combines vulnerability descriptions and source code. The vulnerability description and the source code of each project are processed separately. The vulnerability description of a sample is preprocessed with a procedure designed from our observations of numerous descriptions, after which text features are selected. A Text Recurrent Convolutional Neural Network (TextRCNN) is then applied to learn the text information. For source code, we build its Code Property Graph (CPG) and extract the key domain from it, which is then embedded. The resulting feature vectors are fed into a Relational Graph Attention Network (RGAT). The output vectors of TextRCNN and RGAT are combined to form the feature vector of the current sample, and a Multi-Layer Perceptron (MLP) layer performs the classification.
Results: We conduct our experiments on C/C++ projects from NVD. Experimental results show that our approach achieves a weighted F1-measure of 84.49%, which indicates that it is more effective than the approaches it is compared with.
Conclusion: Our work uses information from both vulnerability descriptions and source code to facilitate vulnerability classification and achieves a higher weighted F1-measure than existing vulnerability classification tools.
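The fusion step described in the Method part of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only says the two output vectors are "combined", so concatenation is an assumption here, as are the vector sizes, the single ReLU hidden layer, the random weights, and every name in the code (`fuse_and_classify`, `params`, `TEXT_DIM`, and so on). The two input vectors stand in for the outputs of TextRCNN and RGAT, which are not modeled.

```python
import math
import random


def linear(vec, weights, bias):
    """Dense layer; `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(text_vec, graph_vec, params):
    """Combine the two feature vectors and classify with a small MLP."""
    # Assumption: "combined" is taken to mean concatenation.
    fused = list(text_vec) + list(graph_vec)
    hidden = [max(0.0, h) for h in linear(fused, params["w1"], params["b1"])]
    return softmax(linear(hidden, params["w2"], params["b2"]))


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes; the abstract does not state any dimensions.
TEXT_DIM, GRAPH_DIM, HIDDEN_DIM, NUM_TYPES = 8, 6, 10, 4

rng = random.Random(0)
params = {
    "w1": random_matrix(HIDDEN_DIM, TEXT_DIM + GRAPH_DIM, rng),
    "b1": [0.0] * HIDDEN_DIM,
    "w2": random_matrix(NUM_TYPES, HIDDEN_DIM, rng),
    "b2": [0.0] * NUM_TYPES,
}

# Placeholders for the TextRCNN output (description) and RGAT output (code).
text_vec = [rng.uniform(-1.0, 1.0) for _ in range(TEXT_DIM)]
graph_vec = [rng.uniform(-1.0, 1.0) for _ in range(GRAPH_DIM)]

probs = fuse_and_classify(text_vec, graph_vec, params)
predicted_type = max(range(NUM_TYPES), key=probs.__getitem__)
print(predicted_type, [round(p, 3) for p in probs])
```

With untrained random weights the predicted type is meaningless; the sketch only shows how one sample's description vector and code vector could flow into a single classification head.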

Pages: 14
