LTRWES: A new framework for security bug report detection

Cited by: 16

Authors:

Jiang, Yuan [1]

Lu, Pengcheng [1]

Su, Xiaohong [1]

Wang, Tiantian [1]

Affiliation:

[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin, Heilongjiang, Peoples R China

Source:

INFORMATION AND SOFTWARE TECHNOLOGY | 2020 / Vol. 124

Funding:

National Natural Science Foundation of China;

Keywords:

Security bug report; Content-based filtering; Word embedding; Machine learning; CLASSIFICATION; PREDICTION;

DOI:

10.1016/j.infsof.2020.106314

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Context: Security bug reports (SBRs) usually describe security-related vulnerabilities in software products, which could be exploited by malicious attackers. Hence, it is important to identify SBRs quickly and accurately among the bug reports (BRs) disclosed in bug tracking systems. Although a few methods have already been proposed for detecting SBRs, challenges remain due to noisy samples, class imbalance, and data scarcity.

Objective: This motivates us to reveal the challenges faced by state-of-the-art SBR prediction methods from the viewpoint of data filtering and representation. This paper also aims to provide a general framework and new solutions to these problems.

Method: In this study, we propose a novel approach, LTRWES, that incorporates learning to rank and word embedding into the identification of SBRs. Unlike previous keyword-based approaches, LTRWES is a content-based data filtering and representation framework with several desirable properties not shared by other methods. First, it uses a ranking model to efficiently filter out non-security bug reports (NSBRs) that have higher content similarity with respect to SBRs. Second, it applies word embedding to transform the remaining NSBRs, together with the SBRs, into low-dimensional real-valued vectors.

Results: Experimental results on benchmark and large real-world datasets show that the proposed method outperforms the state-of-the-art method.

Conclusion: Overall, LTRWES is effective and achieves high performance. It will help security engineers identify SBRs among thousands of NSBRs more accurately than existing algorithms. This should encourage further research and development of content-based methods for security bug report detection.
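The two steps named in the abstract (filter the NSBRs most similar to SBRs, then embed what remains) can be illustrated with a minimal sketch. This is not the authors' implementation: plain bag-of-words cosine similarity stands in for the learning-to-rank model, a tiny hand-written vector table stands in for trained word embeddings, and the names `filter_nsbrs`, `embed`, and `drop_ratio` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization (a simplification)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filter_nsbrs(sbrs, nsbrs, drop_ratio=0.5):
    """Rank NSBRs by their maximum similarity to any SBR and drop the top share.

    Assumption: cosine similarity replaces the paper's ranking model.
    """
    sbr_bags = [Counter(tokenize(s)) for s in sbrs]
    scored = [
        (max(cosine(Counter(tokenize(n)), bag) for bag in sbr_bags), n)
        for n in nsbrs
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    n_drop = int(len(scored) * drop_ratio)
    return [text for _, text in scored[n_drop:]]


def embed(text, vectors, dim):
    """Average the vectors of known tokens; zeros if none are known."""
    known = [vectors[t] for t in tokenize(text) if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


# Toy data, invented for illustration only.
sbrs = ["buffer overflow allows remote code execution"]
nsbrs = [
    "crash caused by buffer overflow in parser",
    "button label is misaligned on settings page",
]
toy_vectors = {"button": [1.0, 0.0], "page": [0.0, 1.0]}

kept = filter_nsbrs(sbrs, nsbrs, drop_ratio=0.5)
features = [embed(text, toy_vectors, dim=2) for text in kept]
```

With this toy data, the parser crash report shares "buffer" and "overflow" with the SBR, so it ranks highest and is filtered out; the remaining NSBR is then mapped to a dense vector. In practice the resulting vectors for SBRs and the kept NSBRs would be fed to a classifier.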


Pages: 17
