Fine-grained Commit-level Vulnerability Type Prediction by CWE Tree Structure

Cited by: 10
Authors
Pan, Shengyi [1]
Bao, Lingfeng [1]
Xia, Xin [2]
Lo, David [3]
Li, Shanping [1]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Peoples R China
[2] Huawei, Shenzhen, Peoples R China
[3] Singapore Management Univ, Sch Informat Syst, Singapore, Singapore
Funding
National Science Foundation (USA); National Research Foundation, Singapore
Keywords
Software Security; Vulnerability Type; CWE; Classification
DOI
10.1109/ICSE48619.2023.00088
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Identifying security patches via code commits to allow early warnings and timely fixes for Open Source Software (OSS) has received increasing attention. However, the existing detection methods can only identify the presence of a patch (i.e., a binary classification) but fail to pinpoint the vulnerability type. In this work, we take the first step to categorize the security patches into fine-grained vulnerability types. Specifically, we use the Common Weakness Enumeration (CWE) as the label and perform fine-grained classification using categories at the third level of the CWE tree. We first formulate the task as a Hierarchical Multi-label Classification (HMC) problem, i.e., inferring a path (a sequence of CWE nodes) from the root of the CWE tree to the node at the target depth. We then propose an approach named TREEVUL with a hierarchical and chained architecture, which manages to utilize the structure information of the CWE tree as prior knowledge of the classification task. We further propose a tree structure aware and beam search based inference algorithm for retrieving the optimal path with the highest merged probability. We collect a large security patch dataset from NVD, consisting of 6,541 commits from 1,560 GitHub OSS repositories. Experimental results show that TREEVUL significantly outperforms the best performing baselines, with improvements of 5.9%, 25.0%, and 7.7% in terms of weighted F1-score, macro F1-score, and MCC, respectively. We further conduct a user study and a case study to verify the practical value of TREEVUL in enriching the binary patch detection results and improving the data quality of NVD, respectively.
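The path inference described in the abstract can be illustrated with a minimal beam search over a toy label tree. This is only a sketch under stated assumptions, not the TREEVUL implementation: the tree, the node names, and the per-node probabilities are hypothetical, and the "merged probability" of a path is assumed here to be the product of the per-level conditional probabilities, which the abstract does not specify.

```python
import math

# Hypothetical three-level label tree (parent -> children); not real CWE nodes.
TREE = {
    "root": ["A", "B"],
    "A": ["A1", "A2"],
    "B": ["B1"],
    "A1": ["A1a", "A1b"],
    "A2": ["A2a"],
    "B1": ["B1a", "B1b"],
}

# Hypothetical classifier outputs: P(node | parent, commit) for one commit.
PROBS = {
    "A": 0.6, "B": 0.4,
    "A1": 0.6, "A2": 0.4, "B1": 1.0,
    "A1a": 0.7, "A1b": 0.3, "A2a": 1.0, "B1a": 0.9, "B1b": 0.1,
}


def beam_search(tree, probs, depth, beam_width):
    """Return up to beam_width (probability, path) pairs, best first.

    Assumption: a path's score is the product of its node probabilities,
    accumulated as a sum of logs for numerical stability.
    """
    beams = [(0.0, ["root"])]  # (log-probability, path so far)
    for _ in range(depth):
        candidates = []
        for score, path in beams:
            # Only children of the last node are expanded, so every
            # candidate is a valid root-to-node path in the tree.
            for child in tree.get(path[-1], []):
                candidates.append((score + math.log(probs[child]), path + [child]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    return [(math.exp(score), path[1:]) for score, path in beams]


greedy = beam_search(TREE, PROBS, depth=3, beam_width=1)
wider = beam_search(TREE, PROBS, depth=3, beam_width=2)
print("beam width 1:", greedy[0])
print("beam width 2:", wider[0])
```

With these made-up numbers, a beam width of 1 commits to the locally best first-level node and ends on a lower-probability path, while a beam width of 2 keeps the alternative branch alive long enough to find the higher-scoring path.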
Pages: 957 - 969
Number of pages: 13