An empirical study of ensemble techniques for software fault prediction

Cited by: 25
Authors
Rathore, Santosh S. [1 ]
Kumar, Sandeep [2 ]
Affiliations
[1] ABV Indian Inst Informat Technol & Management, Dept Informat Technol, Gwalior, India
[2] Indian Inst Technol Roorkee, Dept Comp Sci & Engn, Roorkee, Uttar Pradesh, India
Keywords
Software fault prediction; Ensemble techniques; PROMISE data repository; Empirical analysis; DEFECT PREDICTION; SEVERITY PREDICTION; MACHINE; NUMBER;
DOI
10.1007/s10489-020-01935-6
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many researchers have analyzed various techniques for software fault prediction (SFP). However, the majority of these studies have shown limited prediction capability, and the performance of the techniques was not consistent across software fault datasets. In contrast, SFP models based on ensemble techniques have recently shown promising and improved results across different software fault datasets. Many new and improved ensemble techniques have since been introduced that have not yet been explored for SFP. Motivated by this, the paper investigates ensemble techniques for SFP. We empirically assess the performance of seven ensemble techniques, namely Dagging, Decorate, Grading, MultiBoostAB, RealAdaBoost, Rotation Forest, and Ensemble Selection. We believe that most of these ensemble techniques have not been used before for SFP. We conduct a series of experiments on benchmark fault datasets and use three distinct classification algorithms, namely naive Bayes, logistic regression, and J48 (decision tree), as base learners for the ensemble techniques. Experimental analysis revealed that Rotation Forest with J48 as the base learner achieved the highest precision, recall, and G-mean 1 values of 0.995, 0.994, and 0.994, respectively, and that Decorate achieved the highest AUC value of 0.986. Further, the results of statistical tests showed statistically significant differences in performance among the ensemble techniques used for SFP. Additionally, the cost-benefit analysis showed that SFP models based on these ensemble techniques might help save software testing cost and effort for twenty of the twenty-eight fault datasets used.
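As a rough illustration of the evaluation measures named in the abstract, the following minimal sketch computes precision, recall, and G-mean 1 for the faulty class from predicted labels. It assumes that G-mean 1 is the geometric mean of precision and recall, which is a common definition but is not stated in this record. The function name `fault_metrics` and the toy labels are hypothetical and are not taken from the paper.

```python
import math


def fault_metrics(y_true, y_pred):
    """Precision, recall, and G-mean 1 for the faulty class (label 1).

    Assumption: G-mean 1 = sqrt(precision * recall).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # faulty, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # clean, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # faulty, missed

    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "g_mean_1": math.sqrt(precision * recall),
    }


# Toy labels for ten modules (1 = faulty, 0 = non-faulty); illustrative only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = fault_metrics(y_true, y_pred)
print(metrics)
```

On these toy labels there are three true positives, one false positive, and one false negative, so all three measures come out to 0.75.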
Pages: 3615-3644
Number of pages: 30
Related papers
50 in total
  • [1] An empirical study of ensemble techniques for software fault prediction
    Santosh S. Rathore
    Sandeep Kumar
    [J]. Applied Intelligence, 2021, 51 : 3615 - 3644
  • [2] The Impact of Ensemble Techniques on Software Maintenance Change Prediction: An Empirical Study
    Alsolai, Hadeel
    Roper, Marc
    [J]. APPLIED SCIENCES-BASEL, 2022, 12 (10):
  • [3] An empirical study of some software fault prediction techniques for the number of faults prediction
    Rathore, Santosh S.
    Kumar, Sandeep
    [J]. SOFT COMPUTING, 2017, 21 (24) : 7417 - 7434
  • [4] An empirical study of some software fault prediction techniques for the number of faults prediction
    Santosh S. Rathore
    Sandeep Kumar
    [J]. Soft Computing, 2017, 21 : 7417 - 7434
  • [5] A study on software fault prediction techniques
    Rathore, Santosh S.
    Kumar, Sandeep
    [J]. ARTIFICIAL INTELLIGENCE REVIEW, 2019, 51 (02) : 255 - 327
  • [6] A study on software fault prediction techniques
    Santosh S. Rathore
    Sandeep Kumar
    [J]. Artificial Intelligence Review, 2019, 51 : 255 - 327
  • [7] A Three-Stage Based Ensemble Learning for Improved Software Fault Prediction: An Empirical Comparative Study
    Yohannese, Chubato Wondaferaw
    Li, Tianrui
    Bashir, Kamal
    [J]. INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2018, 11 (01) : 1229 - 1247
  • [8] A three-stage based ensemble learning for improved software fault prediction: An empirical comparative study
    Yohannese C.W.
    Li T.
    Bashir K.
    [J]. International Journal of Computational Intelligence Systems, 2018, Atlantis Press (11) : 1229 - 1247
  • [9] A comprehensive empirical study of count models for software fault prediction
    Gao, Kehan
    Khoshgoftaar, Taghi M.
    [J]. IEEE TRANSACTIONS ON RELIABILITY, 2007, 56 (02) : 223 - 236
  • [10] A sequential ensemble model for software fault prediction
    Monika Mangla
    Nonita Sharma
    Sachi Nandan Mohanty
    [J]. Innovations in Systems and Software Engineering, 2022, 18 : 301 - 308