Predicting Vulnerable Components via Text Mining or Software Metrics? An Effort-aware Perspective

Cited by: 23
Authors
Tang, Yaming [1]
Zhao, Fei [1]
Yang, Yibiao [1]
Lu, Hongmin [1]
Zhou, Yuming [1]
Xu, Baowen [1]
Affiliations
[1] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing, Jiangsu, Peoples R China
Keywords
software metrics; text mining; vulnerability; prediction; effort-aware;
DOI
10.1109/QRS.2015.15
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
To identify vulnerable software components, developers can use software metrics as predictors or apply text mining techniques to build vulnerability prediction models. A recent study reported that text mining based models have higher recall than software metrics based models. However, this conclusion was drawn without considering the sizes of individual components, which affect the code inspection effort needed to determine whether a component is vulnerable. In this paper, we investigate the predictive power of these two kinds of prediction models in the context of effort-aware vulnerability prediction. To this end, we use the same data sets, containing 223 vulnerabilities found in three web applications, to build vulnerability prediction models. The experimental results show that: (1) in the effort-aware ranking scenario, text mining based models only slightly outperform software metrics based models; (2) in the effort-aware classification scenario, text mining based models perform similarly to software metrics based models in most cases; and (3) most of the effect sizes (i.e., the magnitude of the differences) between these two kinds of models are trivial. These results suggest that, from the viewpoint of practical application, software metrics based models are comparable to text mining based models. Therefore, for developers, software metrics based models are practical choices for vulnerability prediction, as the cost to build and apply these models is much lower.
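The idea of effort-aware ranking mentioned in the abstract can be illustrated with a small sketch. The abstract does not specify the evaluation measures used in the paper, so everything below is an assumption made for illustration: the function name `recall_at_effort`, the use of lines of code as the effort proxy, the ranking by predicted risk per line, and the 20% inspection budget are all hypothetical choices, not the authors' method.

```python
from typing import List, Tuple

# Each component is (predicted_risk, lines_of_code, is_vulnerable).
# All names and the 20% default budget are illustrative assumptions.
Component = Tuple[float, int, bool]


def recall_at_effort(components: List[Component],
                     effort_fraction: float = 0.2) -> float:
    """Fraction of vulnerable components found within an inspection budget.

    The budget is a fraction of the total lines of code. Components are
    inspected in descending order of predicted risk per line, so small,
    risky components are inspected first.
    """
    total_loc = sum(loc for _, loc, _ in components)
    total_vulnerable = sum(1 for _, _, vuln in components if vuln)
    if total_loc == 0 or total_vulnerable == 0:
        return 0.0

    budget = effort_fraction * total_loc
    ranked = sorted(components,
                    key=lambda c: c[0] / max(c[1], 1),
                    reverse=True)

    inspected_loc = 0
    found = 0
    for _, loc, vuln in ranked:
        # Stop as soon as the next component would exceed the budget.
        if inspected_loc + loc > budget:
            break
        inspected_loc += loc
        found += int(vuln)
    return found / total_vulnerable


# Toy data (made up): a large risky file, a small risky file, two clean files.
components = [
    (0.9, 1000, True),
    (0.6, 100, True),
    (0.2, 50, False),
    (0.1, 850, False),
]
print(recall_at_effort(components))
```

In this toy data, a ranking by raw predicted risk would put the 1000-line component first, which alone exceeds a 20% budget of 400 lines. Ranking by risk per line instead reaches the small vulnerable component within the budget, which is why component size matters when comparing models.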
Pages: 27 - 36
Number of pages: 10
Related papers
39 in total
  • [21] Detecting vulnerable software functions via text and dependency features
    Xu, Wenlin
    Li, Tong
    Wang, Jinsong
    Tang, Yahui
    [J]. SOFT COMPUTING, 2023, 27 (09) : 5425 - 5435
  • [22] Detecting vulnerable software functions via text and dependency features
    Wenlin Xu
    Tong Li
    Jinsong Wang
    Yahui Tang
    [J]. Soft Computing, 2023, 27 : 5425 - 5435
  • [23] Using Software Metrics for Predicting Vulnerable Code-Components: A Study on Java and Python Open Source Projects
    Chong, Tai-Yin
    Anu, Vaibhav
    Sultana, Kazi Zakia
    [J]. 2019 22ND IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING (IEEE CSE 2019) AND 17TH IEEE INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING (IEEE EUC 2019), 2019, : 98 - 103
  • [24] Early Identification of Vulnerable Software Components via Ensemble Learning
    Pang, Yulei
    Xue, Xiaozhen
    Namin, Akbar Siami
    [J]. 2016 15TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2016), 2016, : 476 - 481
  • [25] An Empirical Investigation of Predicting Fault Count, Fix Cost and Effort Using Software Metrics
    Shatnawi, Raed
    Li, Wei
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2016, 7 (02) : 484 - 491
  • [26] A Text Mining Framework for Analyzing Change Impact and Maintenance Effort of Software Bug Reports
    Malhotra, Ruchika
    Khanna, Megha
    [J]. INTERNATIONAL JOURNAL OF INFORMATION RETRIEVAL RESEARCH, 2022, 12 (01)
  • [27] Are Slice-Based Cohesion Metrics Actually Useful in Effort-Aware Post-Release Fault-Proneness Prediction? An Empirical Study
    Yang, Yibiao
    Zhou, Yuming
    Lu, Hongmin
    Chen, Lin
    Chen, Zhenyu
    Xu, Baowen
    Leung, Hareton
    Zhang, Zhenyu
    [J]. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2015, 41 (04) : 331 - 357
  • [28] CoreBug: Improving Effort-Aware Bug Prediction in Software Systems Using Generalized k-Core Decomposition in Class Dependency Networks
    Du, Xin
    Wang, Tian
    Wang, Liuhai
    Pan, Weifeng
    Chai, Chunlai
    Xu, Xinxin
    Jiang, Bo
    Wang, Jiale
    [J]. AXIOMS, 2022, 11 (05)
  • [29] Predicting Software Maintenance Effort by Mining Software Project Reports Using Inter-Version Validation
    Jindal, Rajni
    Malhotra, Ruchika
    Jain, Abha
    [J]. INTERNATIONAL JOURNAL OF RELIABILITY QUALITY AND SAFETY ENGINEERING, 2016, 23 (06)
  • [30] Using software metrics for predicting vulnerable classes in java and python based systems
    Sultana, Kazi Zakia
    Anu, Vaibhav
    Chong, Tai-Yin
    [J]. INFORMATION SECURITY JOURNAL, 2024, 33 (03) : 251 - 267