An empirical study of ensemble techniques for software fault prediction

Cited by: 29

Authors:

Rathore, Santosh S. [1]

Kumar, Sandeep [2]

Affiliations:

[1] ABV Indian Inst Informat Technol & Management, Dept Informat Technol, Gwalior, India

[2] Indian Inst Technol Roorkee, Dept Comp Sci & Engn, Roorkee, Uttar Pradesh, India

Source:

APPLIED INTELLIGENCE | 2021 / Volume 51 / Issue 06

Keywords:

Software fault prediction; Ensemble techniques; PROMISE data repository; Empirical analysis; DEFECT PREDICTION; SEVERITY PREDICTION; MACHINE; NUMBER;

DOI:

10.1007/s10489-020-01935-6

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Many researchers have analyzed various techniques for software fault prediction (SFP). However, most of these studies reported limited prediction capability, and their performance was not consistent across software fault datasets. In contrast, SFP models based on ensemble techniques have recently shown promising and improved results across different software fault datasets. Many new and improved ensemble techniques have since been introduced but have not yet been explored for SFP. Motivated by this, the paper investigates ensemble techniques for SFP. We empirically assess the performance of seven ensemble techniques, namely Dagging, Decorate, Grading, MultiBoostAB, RealAdaBoost, Rotation Forest, and Ensemble Selection. We believe that most of these ensemble techniques have not been used before for SFP. We conduct a series of experiments on benchmark fault datasets and use three distinct classification algorithms, namely naive Bayes, logistic regression, and J48 (decision tree), as base learners for the ensemble techniques. Experimental analysis revealed that Rotation Forest with J48 as the base learner achieved the highest precision, recall, and G-mean 1 values of 0.995, 0.994, and 0.994, respectively, and that Decorate achieved the highest AUC value of 0.986. Further, the results of statistical tests showed statistically significant performance differences among the ensemble techniques used for SFP. Additionally, the cost-benefit analysis showed that SFP models based on these ensemble techniques might help save software testing cost and effort for twenty of the twenty-eight fault datasets used.
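The abstract reports precision, recall, and G-mean 1 without defining them. As a minimal sketch, the snippet below computes the three measures from confusion-matrix counts for the faulty class. It assumes G-mean 1 is the geometric mean of precision and recall, which is a common definition in the fault prediction literature but is not stated in this record. The function name and the example counts are hypothetical and are not taken from the paper.

```python
import math


def fault_prediction_metrics(tp, fp, tn, fn):
    """Compute precision, recall, and G-mean 1 for the faulty (positive) class.

    tp, fp, tn, fn are confusion-matrix counts. tn is accepted for
    completeness but is not needed by these three measures.
    """
    # Precision: share of modules predicted faulty that really are faulty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of truly faulty modules that the model found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Assumption: G-mean 1 = sqrt(precision * recall).
    g_mean_1 = math.sqrt(precision * recall)
    return {"precision": precision, "recall": recall, "g_mean_1": g_mean_1}


# Hypothetical counts for illustration only.
example = fault_prediction_metrics(tp=40, fp=10, tn=140, fn=10)
print(example)
```

Because G-mean 1 under this assumption multiplies precision and recall, it drops to zero whenever either measure is zero, so a model cannot score well by favoring one at the expense of the other.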

Pages: 3615-3644

Page count: 30

