Which process metrics can significantly improve defect prediction models? An empirical study

Authors
Lech Madeyski
Marian Jureczko
Affiliation
[1] Wroclaw University of Technology
Source
Software Quality Journal | 2015 / Vol. 23
Keywords
Software metrics; Product metrics; Process metrics; Defect prediction models; Software defect prediction
DOI
Not available
Abstract
Knowledge of which software metrics serve as defect indicators is vital for the efficient allocation of quality assurance resources. Process metrics, although sometimes difficult to collect, have recently become popular in defect prediction. However, to correctly identify the process metrics that are actually worth collecting, we need evidence validating their ability to improve defect prediction models based on product metrics. This paper presents an empirical evaluation in which several process metrics were investigated to identify the ones that significantly improve such models. Data from a wide range of software projects (both industrial and open source) were collected. The predictions of models that use only product metrics (simple models) were compared with the predictions of models that use product metrics together with one of the process metrics under scrutiny (advanced models). To decide whether the improvements were significant, statistical tests were performed and effect sizes were calculated.
The advanced defect prediction models trained on a data set containing product metrics and additionally Number of Distinct Committers (NDC) were significantly better than the simple models without NDC, while the effect size was medium and the probability of superiority (PS) of the advanced models over simple ones was high ($p=.016$, $r=-.29$, $\mathrm{PS}=.76$), which is a substantial finding useful in defect prediction.
A similar result with slightly smaller PS was achieved by the advanced models trained on a data set containing product metrics and additionally all of the investigated process metrics ($p=.038$, $r=-.29$, $\mathrm{PS}=.68$). The advanced models trained on a data set containing product metrics and additionally Number of Modified Lines (NML) were significantly better than the simple models without NML, but the effect size was small ($p=.038$, $r=.06$). Hence, it is reasonable to recommend the NDC process metric in building the defect prediction models.
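The paired comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not name the statistical test or the evaluation measure, so everything here is an assumption for illustration: a Wilcoxon signed-rank statistic with a normal approximation, an effect size computed as r = Z / sqrt(N) with N taken as the number of non-tied pairs (one of several conventions), a PS computed as the share of pairs in which the advanced model wins (ties counted as half), and a hypothetical per-project score where lower is better. The toy numbers are made up and are not the study's data.

```python
import math

def probability_of_superiority(simple, advanced):
    """Share of paired projects where the advanced model scores lower
    (assumed: lower is better); ties count as half a win."""
    wins = sum(1.0 if a < s else 0.5 if a == s else 0.0
               for s, a in zip(simple, advanced))
    return wins / len(simple)

def wilcoxon_z_and_r(simple, advanced):
    """Wilcoxon signed-rank Z (normal approximation, no continuity or
    tie correction) and effect size r = Z / sqrt(n), n = non-tied pairs."""
    diffs = [a - s for s, a in zip(simple, advanced) if a != s]
    n = len(diffs)
    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    # Sum of ranks for pairs where the advanced model scored higher (worse).
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    return z, z / math.sqrt(n)

# Hypothetical per-project scores (lower is better); not the study's data.
simple_scores = [60, 55, 70, 65, 50, 58]
advanced_scores = [55, 56, 62, 60, 50, 51]

if __name__ == "__main__":
    z, r = wilcoxon_z_and_r(simple_scores, advanced_scores)
    ps = probability_of_superiority(simple_scores, advanced_scores)
    print(f"Z = {z:.3f}, r = {r:.3f}, PS = {ps:.2f}")
```

Under the lower-is-better assumption, a negative Z and r indicate that the advanced models tend to score better than the simple ones, which would match the sign of the r values reported above. A real analysis would use an established statistics library and an exact or corrected p-value rather than this hand-rolled approximation.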
Pages: 393-422
Page count: 29
Related papers
50 in total
  • [1] Which process metrics can significantly improve defect prediction models? An empirical study
    Madeyski, Lech
    Jureczko, Marian
    [J]. SOFTWARE QUALITY JOURNAL, 2015, 23 (03) : 393 - 422
  • [2] Which Process Metrics Are Significantly Important to Change of Defects in Evolving Projects: An Empirical Study
    Jiang, Li
    Jiang, Shujuan
    Gong, Lina
    Dong, Yue
    Yu, Qiao
    [J]. IEEE ACCESS, 2020, 8 : 93705 - 93722
  • [3] Empirical Investigation of Code and Process Metrics for Defect Prediction
    Han, Wenjing
    Lung, Chung-Horng
    Ajila, Samuel A.
    [J]. 2016 IEEE SECOND INTERNATIONAL CONFERENCE ON MULTIMEDIA BIG DATA (BIGMM), 2016, : 436 - 439
  • [4] An Empirical Study of Software Metrics Diversity for Cross-Project Defect Prediction
    Zhong, Yiwen
    Song, Kun
    Lv, ShengKai
    He, Peng
    [J]. Mathematical Problems in Engineering, 2021, 2021
  • [5] Evaluating Defect Prediction Approaches Using A Massive Set of Metrics: An Empirical Study
    Xuan, Xiao
    Lo, David
    Xia, Xin
    Tian, Yuan
    [J]. 30TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, VOLS I AND II, 2015, : 1644 - 1647
  • [6] An Empirical Study on Software Fault Prediction Using Product and Process Metrics
    Shatnawi, Raed
    Mishra, Alok
    [J]. INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGIES AND SYSTEMS APPROACH, 2021, 14 (01) : 62 - 78
  • [7] Software Defect Prediction Using Process Metrics ElasticSearch Engine Case Study
    Mpofu, Bongeka
    Mnkandla, Enerst
    [J]. 2016 THIRD INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATION AND ENGINEERING (ICACCE 2016), 2016, : 254 - 259
  • [8] An Empirical Study of Model-Agnostic Techniques for Defect Prediction Models
    Jiarpakdee, Jirayus
    Tantithamthavorn, Chakkrit
    Dam, Hoa Khanh
    Grundy, John
    [J]. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2022, 48 (01) : 166 - 185
  • [9] The Performance Stability of Defect Prediction Models with Class Imbalance: An Empirical Study
    Yu, Qiao
    Jiang, Shujuan
    Zhang, Yanmei
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2017, E100D (02) : 265 - 272
  • [10] A Study on the Significance of Software Metrics in Defect Prediction
    Xia, Ye
    Yan, Guoying
    Si, Qianran
    [J]. 2013 SIXTH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID), VOL 2, 2013, : 343 - 346