An empirical study of ensemble techniques for software fault prediction

Cited by: 29
Authors
Rathore, Santosh S. [1 ]
Kumar, Sandeep [2 ]
Affiliations
[1] ABV Indian Inst Informat Technol & Management, Dept Informat Technol, Gwalior, India
[2] Indian Inst Technol Roorkee, Dept Comp Sci & Engn, Roorkee, Uttar Pradesh, India
Keywords
Software fault prediction; Ensemble techniques; PROMISE data repository; Empirical analysis; DEFECT PREDICTION; SEVERITY PREDICTION; MACHINE; NUMBER;
DOI
10.1007/s10489-020-01935-6
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many researchers have previously analyzed various techniques for software fault prediction (SFP). Most of these studies, however, reported limited prediction capability, and the performance of the techniques was not consistent across software fault datasets. In contrast, SFP models based on ensemble techniques have recently shown promising and improved results across different software fault datasets. Yet many new and improved ensemble techniques have been introduced that have not been explored for SFP. Motivated by this, the paper investigates ensemble techniques for SFP. We empirically assess the performance of seven ensemble techniques, namely Dagging, Decorate, Grading, MultiBoostAB, RealAdaBoost, Rotation Forest, and Ensemble Selection. We believe that most of these ensemble techniques have not been used before for SFP. We conduct a series of experiments on benchmark fault datasets and use three distinct classification algorithms, namely naive Bayes, logistic regression, and J48 (decision tree), as base learners for the ensemble techniques. Experimental analysis revealed that Rotation Forest with J48 as the base learner achieved the highest precision, recall, and G-mean 1 values of 0.995, 0.994, and 0.994, respectively, and Decorate achieved the highest AUC value of 0.986. Further, the results of statistical tests showed a statistically significant difference in performance among the ensemble techniques used for SFP. Additionally, the cost-benefit analysis showed that SFP models based on these ensemble techniques might help save software testing cost and effort for twenty of the twenty-eight fault datasets used.
Pages: 3615 - 3644
Page count: 30
Related papers
50 items in total
  • [31] Empirical Study on Software Bug Prediction
    Rizwan, Syed
    Wang Tiantian
    Su Xiaohong
    Salahuddin
    2017 INTERNATIONAL CONFERENCE ON SOFTWARE AND E-BUSINESS (ICSEB 2017), 2015, : 55 - 59
  • [32] Two staged data preprocessing ensemble model for software fault prediction
    Elahi, Ehsan
    Ayub, Amber
    Hussain, Irfan
    PROCEEDINGS OF 2021 INTERNATIONAL BHURBAN CONFERENCE ON APPLIED SCIENCES AND TECHNOLOGIES (IBCAST), 2021, : 506 - 511
  • [33] Rough Noise-Filtered Easy Ensemble for software Fault Prediction
    Riaz, Saman
    Arshad, Ali
    Jiao, Licheng
    IEEE ACCESS, 2018, 6 : 46886 - 46899
  • [34] An empirical study of data sampling techniques for just-in-time software defect prediction
    Li, Zhiqiang
    Du, Qiannan
    Zhang, Hongyu
    Jing, Xiao-Yuan
    Wu, Fei
    AUTOMATED SOFTWARE ENGINEERING, 2024, 31 (02)
  • [35] An Empirical Study on Application of Word Embedding Techniques for Prediction of Software Defect Severity Level
    Kumar, Lov
    Kumar, Mukesh
    Murthy, Lalita Bhanu
    Misra, Sanjay
    Kocher, Vipul
    Padmanabhuni, Srinivas
    PROCEEDINGS OF THE 2021 16TH CONFERENCE ON COMPUTER SCIENCE AND INTELLIGENCE SYSTEMS (FEDCSIS), 2021, : 477 - 484
  • [36] Fault Prediction Model for Software Using Soft Computing Techniques
    Nisa, Ishrat Un
    Ahsan, Syed Nadeem
    2015 INTERNATIONAL CONFERENCE ON OPEN SOURCE SYSTEMS & TECHNOLOGIES (ICOSST), 2015, : 78 - 83
  • [37] A systematic review of machine learning techniques for software fault prediction
    Malhotra, Ruchika
    APPLIED SOFT COMPUTING, 2015, 27 : 504 - 518
  • [38] Improved prediction of software defects using ensemble machine learning techniques
    Mehta, Sweta
    Patnaik, K. Sridhar
    Neural Computing and Applications, 2021, 33 : 10551 - 10562
  • [39] Improved prediction of software defects using ensemble machine learning techniques
    Mehta, Sweta
    Patnaik, K. Sridhar
    NEURAL COMPUTING & APPLICATIONS, 2021, 33 (16): : 10551 - 10562
  • [40] An Empirical Analysis on Effective Fault Prediction Model Developed using Ensemble Methods
    Kumar, Lov
    Rath, Santanu
    Sureka, Ashish
    2017 IEEE 41ST ANNUAL COMPUTER SOFTWARE AND APPLICATIONS CONFERENCE (COMPSAC), VOL 1, 2017, : 244 - 249