VulExplainer: A Transformer-Based Hierarchical Distillation for Explaining Vulnerability Types

Cited by: 12
Authors
Fu, Michael [1 ]
Nguyen, Van [1 ]
Tantithamthavorn, Chakkrit [1 ]
Le, Trung [1 ]
Phung, Dinh [1 ]
Affiliations
[1] Monash Univ, Fac Informat Technol, Melbourne, Australia
Keywords
Software vulnerability; software security; classification
DOI
10.1109/TSE.2023.3305244
CLC number
TP31 [Computer software]
Discipline codes
081202; 0835
Abstract
Deep learning-based vulnerability prediction approaches have been proposed to help under-resourced security practitioners detect vulnerable functions. However, practitioners still do not know what type of vulnerability (i.e., CWE-ID) corresponds to a given prediction. Thus, an approach that explains the vulnerability type of a given prediction is imperative. In this paper, we propose VulExplainer, an approach to explain vulnerability types, which we formulate as a vulnerability classification task. However, vulnerabilities have diverse characteristics (i.e., CWE-IDs) and the number of labeled samples per CWE-ID is highly imbalanced (a highly imbalanced multi-class classification problem), which often leads to inaccurate predictions. We therefore introduce a Transformer-based hierarchical distillation for software vulnerability classification that addresses the highly imbalanced distribution of vulnerability types. Specifically, we split the complex label distribution into sub-distributions based on CWE abstract types (i.e., categorizations that group similar CWE-IDs), so that similar CWE-IDs are grouped together and each group has a more balanced label distribution. We train a TextCNN teacher on each simplified distribution; however, each teacher performs well only within its own group. We therefore build a Transformer student model that generalizes the performance of the TextCNN teachers through our hierarchical knowledge distillation framework. Through an extensive evaluation on 8,636 real-world vulnerabilities, our approach outperforms all baselines by 5%-29%. The results also demonstrate that our approach can be applied to Transformer-based architectures such as CodeBERT, GraphCodeBERT, and CodeGPT. Moreover, our method is compatible with any Transformer-based model without architectural modifications; it only adds a special distillation token to the input. These results highlight our significant contributions towards the fundamental and practical problem of explaining software vulnerability types.
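To make the distillation-token idea from the abstract concrete, the following is a minimal sketch (in PyTorch, which the abstract does not prescribe) of a Transformer student that carries both a classification token and a learnable distillation token, trained against soft labels from a group-specific teacher. All class names, hyperparameters, and the dummy data are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of knowledge distillation with an
# extra distillation token: a Transformer student mimics group-specific
# teacher predictions alongside the usual cross-entropy objective.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StudentWithDistillToken(nn.Module):
    """Transformer encoder student with a [CLS]-like classification token
    plus a learnable distillation token prepended to the input sequence."""
    def __init__(self, vocab_size=50000, d_model=256, num_classes=40):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, d_model))
        self.distill_token = nn.Parameter(torch.zeros(1, 1, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.cls_head = nn.Linear(d_model, num_classes)      # supervised head
        self.distill_head = nn.Linear(d_model, num_classes)  # mimics teacher

    def forward(self, token_ids):
        x = self.embed(token_ids)
        b = x.size(0)
        x = torch.cat([self.cls_token.expand(b, -1, -1),
                       self.distill_token.expand(b, -1, -1), x], dim=1)
        h = self.encoder(x)
        return self.cls_head(h[:, 0]), self.distill_head(h[:, 1])

def distillation_loss(cls_logits, distill_logits, teacher_logits,
                      labels, T=2.0, alpha=0.5):
    """Hard-label cross-entropy on the classification head plus
    temperature-scaled KL divergence between the distillation head
    and the teacher's soft predictions (assumed loss form)."""
    ce = F.cross_entropy(cls_logits, labels)
    kd = F.kl_div(F.log_softmax(distill_logits / T, dim=-1),
                  F.softmax(teacher_logits / T, dim=-1),
                  reduction="batchmean") * (T * T)
    return alpha * ce + (1 - alpha) * kd

# Usage sketch: in the paper's framework, each sample's teacher logits would
# come from the TextCNN teacher trained on the CWE abstract-type group that
# contains the sample's CWE-ID; here they are random placeholders.
student = StudentWithDistillToken()
tokens = torch.randint(0, 50000, (8, 128))   # dummy token ids
labels = torch.randint(0, 40, (8,))          # dummy CWE-ID labels
teacher_logits = torch.randn(8, 40)          # placeholder teacher outputs
cls_logits, distill_logits = student(tokens)
loss = distillation_loss(cls_logits, distill_logits, teacher_logits, labels)
loss.backward()
```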
Pages: 4550-4565
Page count: 16