Predicting Vulnerable Components via Text Mining or Software Metrics? An Effort-aware Perspective

Cited by: 23

Authors:

Tang, Yaming [1]

Zhao, Fei [1]

Yang, Yibiao [1]

Lu, Hongmin [1]

Zhou, Yuming [1]

Xu, Baowen [1]

Affiliation:

[1] Nanjing University, State Key Laboratory for Novel Software Technology, Nanjing, Jiangsu, People's Republic of China

Source:

2015 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE SECURITY AND RELIABILITY (QRS 2015) | 2015

Keywords:

software metrics; text mining; vulnerability; prediction; effort-aware;

DOI:

10.1109/QRS.2015.15

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

To identify vulnerable software components, developers can use software metrics as predictors or use text mining techniques to build vulnerability prediction models. A recent study reported that text mining based models have higher recall than software metrics based models. However, this conclusion was drawn without considering the sizes of individual components, which affect the code inspection effort needed to determine whether a component is vulnerable. In this paper, we investigate the predictive power of these two kinds of prediction models in the context of effort-aware vulnerability prediction. To this end, we use the same data sets, containing 223 vulnerabilities found in three web applications, to build vulnerability prediction models. The experimental results show that: (1) in the effort-aware ranking scenario, text mining based models only slightly outperform software metrics based models; (2) in the effort-aware classification scenario, text mining based models perform similarly to software metrics based models in most cases; and (3) most of the effect sizes (i.e., the magnitudes of the differences) between these two kinds of models are trivial. These results suggest that, from the viewpoint of practical application, software metrics based models are comparable to text mining based models. Therefore, for developers, software metrics based models are practical choices for vulnerability prediction, as the cost to build and apply these models is much lower.

Pages: 27-36

Number of pages: 10

