Fine-grained Commit-level Vulnerability Type Prediction by CWE Tree Structure

Cited by: 10
Authors
Pan, Shengyi [1]
Bao, Lingfeng [1]
Xia, Xin [2]
Lo, David [3]
Li, Shanping [1]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Peoples R China
[2] Huawei, Shenzhen, Peoples R China
[3] Singapore Management Univ, Sch Informat Syst, Singapore, Singapore
Funding
National Science Foundation (United States); National Research Foundation, Singapore
Keywords
Software Security; Vulnerability Type; CWE; Classification
DOI
10.1109/ICSE48619.2023.00088
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Identifying security patches via code commits to allow early warnings and timely fixes for Open Source Software (OSS) has received increasing attention. However, the existing detection methods can only identify the presence of a patch (i.e., a binary classification) but fail to pinpoint the vulnerability type. In this work, we take the first step to categorize the security patches into fine-grained vulnerability types. Specifically, we use the Common Weakness Enumeration (CWE) as the label and perform fine-grained classification using categories at the third level of the CWE tree. We first formulate the task as a Hierarchical Multi-label Classification (HMC) problem, i.e., inferring a path (a sequence of CWE nodes) from the root of the CWE tree to the node at the target depth. We then propose an approach named TREEVUL with a hierarchical and chained architecture, which manages to utilize the structure information of the CWE tree as prior knowledge of the classification task. We further propose a tree structure aware and beam search based inference algorithm for retrieving the optimal path with the highest merged probability. We collect a large security patch dataset from NVD, consisting of 6,541 commits from 1,560 GitHub OSS repositories. Experimental results show that TREEVUL significantly outperforms the best performing baselines, with improvements of 5.9%, 25.0%, and 7.7% in terms of weighted F1-score, macro F1-score, and MCC, respectively. We further conduct a user study and a case study to verify the practical value of TREEVUL in enriching the binary patch detection results and improving the data quality of NVD, respectively.
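The tree-aware beam search inference described in the abstract can be illustrated with a minimal sketch. This is not the TREEVUL implementation: the toy tree, the node names, the probability values, and the function `beam_search` are all hypothetical, and the "merged probability" of a path is assumed here to be the product of per-level conditional probabilities (accumulated as a sum of logs).

```python
import math

# Hypothetical toy tree (parent -> children); names are placeholders, not real CWE IDs.
CHILDREN = {
    "root": ["A", "B"],
    "A": ["A1", "A2"],
    "B": ["B1", "B2"],
    "A1": ["A1x", "A1y"],
    "A2": ["A2x", "A2y"],
    "B1": ["B1x", "B1y"],
    "B2": ["B2x"],
}

# Hypothetical P(child | parent, commit), standing in for per-level classifier outputs.
COND_PROB = {
    "A": 0.55, "B": 0.45,
    "A1": 0.4, "A2": 0.6, "B1": 0.9, "B2": 0.1,
    "A1x": 0.6, "A1y": 0.4, "A2x": 0.7, "A2y": 0.3,
    "B1x": 0.8, "B1y": 0.2, "B2x": 1.0,
}


def beam_search(children, cond_prob, root="root", depth=3, beam_width=2):
    """Return the best root-to-depth path and its probability (assumed to be a product)."""
    beams = [([root], 0.0)]  # (path, accumulated log-probability)
    for _ in range(depth):
        candidates = []
        for path, score in beams:
            # Only children of the current node are expanded, so every
            # candidate is a valid path in the tree.
            for child in children.get(path[-1], []):
                candidates.append((path + [child], score + math.log(cond_prob[child])))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]  # keep the top-scoring partial paths
    best_path, best_score = beams[0]
    return best_path, math.exp(best_score)


if __name__ == "__main__":
    print(beam_search(CHILDREN, COND_PROB, beam_width=2))
    print(beam_search(CHILDREN, COND_PROB, beam_width=1))  # greedy decoding
```

In this toy setup, a beam width of 1 (greedy decoding) commits to node `A` at the first level and ends on a lower-probability path, while a beam width of 2 keeps `B` alive long enough to recover the path `root, B, B1, B1x`. That is the kind of error a beam search over the tree is meant to reduce.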
Pages: 957-969
Number of pages: 13