Which process metrics can significantly improve defect prediction models? An empirical study

Cited by: 0

Authors:

Lech Madeyski

Marian Jureczko

Affiliation:

[1] Wroclaw University of Technology

Source:

Software Quality Journal | 2015 / Vol. 23

Keywords:

Software metrics; Product metrics; Process metrics; Defect prediction models; Software defect prediction

DOI:

Not available

Abstract:

Knowing which software metrics serve as defect indicators is vital for allocating quality assurance resources efficiently. Process metrics, although sometimes difficult to collect, have recently become popular in defect prediction. However, to correctly identify the process metrics that are actually worth collecting, we need evidence validating their ability to improve defect prediction models based on product metrics. This paper presents an empirical evaluation in which several process metrics were investigated to identify the ones that significantly improve defect prediction models based on product metrics. Data from a wide range of software projects (both industrial and open source) were collected. The predictions of models that use only product metrics (simple models) were compared with the predictions of models that use product metrics plus one of the process metrics under scrutiny (advanced models). To decide whether the improvements were significant, statistical tests were performed and effect sizes were calculated.
The advanced defect prediction models trained on a data set containing product metrics and additionally Number of Distinct Committers (NDC) were significantly better than the simple models without NDC, while the effect size was medium and the probability of superiority (PS) of the advanced models over simple ones was high (p = .016, r = -.29, PS = .76), which is a substantial finding useful in defect prediction.
A similar result with slightly smaller PS was achieved by the advanced models trained on a data set containing product metrics and additionally all of the investigated process metrics (p = .038, r = -.29, PS = .68). The advanced models trained on a data set containing product metrics and additionally Number of Modified Lines (NML) were significantly better than the simple models without NML, but the effect size was small (p = .038, r = .06). Hence, it is reasonable to recommend the NDC process metric in building the defect prediction models.
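The abstract reports a p-value, an effect size r, and a probability of superiority (PS) for each comparison but does not say how they were computed. The sketch below shows one common way to obtain such figures for paired model scores. It assumes a Wilcoxon signed-rank test with a normal approximation, r = z / sqrt(N) with N taken as the total number of observations (two per pair), and PS as the share of pairs in which the advanced model wins. The scores are hypothetical, lower is assumed to be better, and the function names are illustrative rather than taken from the paper.

```python
import math


def probability_of_superiority(simple, advanced):
    """Share of pairs where the advanced score is lower (better); ties count half."""
    wins = sum(1.0 if a < s else 0.5 if a == s else 0.0
               for s, a in zip(simple, advanced))
    return wins / len(simple)


def wilcoxon_effect_size(simple, advanced):
    """Return (z, two-sided p, r) from a Wilcoxon signed-rank test.

    Uses the normal approximation without tie or continuity correction,
    so p is only approximate for small samples.
    """
    # Zero differences are dropped, as in the classic Wilcoxon procedure.
    diffs = [a - s for s, a in zip(simple, advanced) if a != s]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # Sum of ranks of positive differences (advanced worse than simple).
    w_plus = sum(rank for rank, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    r = z / math.sqrt(2 * n)  # assumption: N = observations in both conditions
    return z, p, r


# Hypothetical per-project scores (lower is better); not data from the paper.
simple_scores = [50, 60, 55, 70, 65, 58]
advanced_scores = [45, 62, 50, 64, 60, 58]

z, p, r = wilcoxon_effect_size(simple_scores, advanced_scores)
ps = probability_of_superiority(simple_scores, advanced_scores)
print(f"z={z:.3f} p={p:.3f} r={r:.3f} PS={ps:.2f}")
```

Under these assumptions a negative r means the advanced model tends to have lower (better) scores. Other conventions divide z by the square root of the number of pairs instead, which gives a larger magnitude for the same data.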

Pages: 393-422

Page count: 29

[1] Which process metrics can significantly improve defect prediction models? An empirical study
Madeyski, Lech
Jureczko, Marian
[J]. SOFTWARE QUALITY JOURNAL, 2015, 23 (03) : 393 - 422