VulExplainer: A Transformer-Based Hierarchical Distillation for Explaining Vulnerability Types

Cited by: 12
Authors
Fu, Michael [1 ]
Nguyen, Van [1 ]
Tantithamthavorn, Chakkrit [1 ]
Le, Trung [1 ]
Phung, Dinh [1 ]
Affiliations
[1] Monash Univ, Fac Informat Technol, Melbourne, Australia
Keywords
Software vulnerability; software security; classification
DOI
10.1109/TSE.2023.3305244
CLC number
TP31 [Computer software]
Discipline codes
081202; 0835
Abstract
Deep learning-based vulnerability prediction approaches have been proposed to help under-resourced security practitioners detect vulnerable functions. However, practitioners still do not know what type of vulnerability (i.e., CWE-ID) corresponds to a given prediction. Thus, an approach that explains the vulnerability type of a given prediction is imperative. In this paper, we propose VulExplainer, an approach to explain vulnerability types, which we formulate as a vulnerability classification task. However, vulnerabilities have diverse characteristics (i.e., CWE-IDs) and the number of labeled samples per CWE-ID is highly imbalanced (a highly imbalanced multi-class classification problem), which often leads to inaccurate predictions. We therefore introduce a Transformer-based hierarchical distillation for software vulnerability classification that addresses the highly imbalanced distribution of vulnerability types. Specifically, we split the complex label distribution into sub-distributions based on CWE abstract types (i.e., categorizations that group similar CWE-IDs), so that similar CWE-IDs are grouped together and each group has a more balanced label distribution. We train a TextCNN teacher on each simplified distribution; however, each teacher performs well only within its own group. We therefore build a Transformer student model that generalizes the performance of the TextCNN teachers through our hierarchical knowledge distillation framework. Through an extensive evaluation on 8,636 real-world vulnerabilities, our approach outperforms all baselines by 5%-29%. The results also demonstrate that our approach can be applied to Transformer-based architectures such as CodeBERT, GraphCodeBERT, and CodeGPT. Moreover, our method is compatible with any Transformer-based model without architectural modifications; it only adds a special distillation token to the input. These results highlight our significant contributions towards the fundamental and practical problem of explaining software vulnerability types.
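To make the distillation-token idea from the abstract concrete, the following is a minimal sketch (in PyTorch, which the abstract does not prescribe) of a Transformer student that carries both a classification token and a learnable distillation token, trained against soft labels from a group-specific teacher. All class names, hyperparameters, and the dummy data are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of knowledge distillation with an
# extra distillation token: a Transformer student mimics group-specific
# teacher predictions alongside the usual cross-entropy objective.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StudentWithDistillToken(nn.Module):
    """Transformer encoder student with a [CLS]-like classification token
    plus a learnable distillation token prepended to the input sequence."""
    def __init__(self, vocab_size=50000, d_model=256, num_classes=40):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, d_model))
        self.distill_token = nn.Parameter(torch.zeros(1, 1, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.cls_head = nn.Linear(d_model, num_classes)      # supervised head
        self.distill_head = nn.Linear(d_model, num_classes)  # mimics teacher

    def forward(self, token_ids):
        x = self.embed(token_ids)
        b = x.size(0)
        x = torch.cat([self.cls_token.expand(b, -1, -1),
                       self.distill_token.expand(b, -1, -1), x], dim=1)
        h = self.encoder(x)
        return self.cls_head(h[:, 0]), self.distill_head(h[:, 1])

def distillation_loss(cls_logits, distill_logits, teacher_logits,
                      labels, T=2.0, alpha=0.5):
    """Hard-label cross-entropy on the classification head plus
    temperature-scaled KL divergence between the distillation head
    and the teacher's soft predictions (assumed loss form)."""
    ce = F.cross_entropy(cls_logits, labels)
    kd = F.kl_div(F.log_softmax(distill_logits / T, dim=-1),
                  F.softmax(teacher_logits / T, dim=-1),
                  reduction="batchmean") * (T * T)
    return alpha * ce + (1 - alpha) * kd

# Usage sketch: in the paper's framework, each sample's teacher logits would
# come from the TextCNN teacher trained on the CWE abstract-type group that
# contains the sample's CWE-ID; here they are random placeholders.
student = StudentWithDistillToken()
tokens = torch.randint(0, 50000, (8, 128))   # dummy token ids
labels = torch.randint(0, 40, (8,))          # dummy CWE-ID labels
teacher_logits = torch.randn(8, 40)          # placeholder teacher outputs
cls_logits, distill_logits = student(tokens)
loss = distillation_loss(cls_logits, distill_logits, teacher_logits, labels)
loss.backward()
```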
Pages: 4550-4565
Page count: 16