Software Defect Prediction for Healthcare Big Data: An Empirical Evaluation of Machine Learning Techniques

Cited by: 27
Authors
Khan, Bilal [1]
Naseem, Rashid [2]
Shah, Muhammad Arif [2]
Wakil, Karzan [3]
Khan, Atif [4]
Uddin, M. Irfan [5]
Mahmoud, Marwan [6]
Affiliations
[1] City Univ Sci & Informat Technol, Dept Comp Sci, Peshawar 25000, Pakistan
[2] Pak Austria Fachhsch Inst Appl Sci & Technol, Dept IT & Comp Sci, Haripur, Pakistan
[3] Sulaimani Polytech Univ, Res Ctr, Sulimani 46001, Kurdistan Regio, Iraq
[4] Islamia Coll, Dept Comp Sci, Peshawar 2500, Pakistan
[5] Kohat Univ Sci & Technol, Inst Comp, Kohat, Pakistan
[6] King Abdulaziz Univ, Fac Appl Studies, Jeddah, Saudi Arabia
Keywords
DECISION-SUPPORT-SYSTEM; ATTRIBUTES; ALGORITHM;
DOI
10.1155/2021/8899263
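Chinese Library Classification number
R19 [Health care organization and services (health services administration)];
Discipline classification code
Abstract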
Software defect prediction (SDP) in the early phase of the software development life cycle (SDLC) remains a critical task. SDP has been studied extensively over the last few decades because it helps assure the quality of software systems. Early prediction of defective artifacts during software development can help the development team use existing resources more efficiently and effectively to deliver high-quality software products within a given or limited time. Previously, several researchers have developed defect prediction models using machine learning (ML) and statistical techniques. ML methods are considered an effective approach for pinpointing defective modules by mining hidden patterns among software metrics (attributes). ML techniques have also been applied by several researchers to healthcare datasets. This study applies different ML techniques to software defect prediction using seven widely used datasets. The ML techniques include the multilayer perceptron (MLP), support vector machine (SVM), decision tree (J48), radial basis function (RBF), random forest (RF), hidden Markov model (HMM), credal decision tree (CDT), K-nearest neighbor (KNN), average one dependency estimator (A1DE), and Naive Bayes (NB). The performance of each technique is evaluated using several measures: relative absolute error (RAE), mean absolute error (MAE), root mean squared error (RMSE), root relative squared error (RRSE), recall, and accuracy. The overall results show that RF performs best, with 88.32% average accuracy and a rank value of 2.96; SVM achieves the second-best performance, with 87.99% average accuracy and a rank value of 3.83. CDT is placed third, with 87.88% average accuracy and a rank value of 3.62. The comprehensive results of this research can be used as a reference point for new research in the SDP domain, so that any claim of improved prediction by a new technique or model can be benchmarked and verified.
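The evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes the common textbook definitions, in which RAE and RRSE normalize the model's error by the error of a baseline that always predicts the mean of the actual values. The abstract does not state the exact formulas or tooling used in the study, so the function names and the toy data below are hypothetical.

```python
import math


def error_measures(actual, predicted):
    """Return MAE, RMSE, RAE, and RRSE for paired numeric values.

    Assumption: RAE and RRSE are relative to a baseline that always
    predicts the mean of `actual`. The actual values must not all be
    identical, otherwise the baseline error is zero.
    """
    n = len(actual)
    mean_actual = sum(actual) / n
    abs_err = [abs(p - a) for a, p in zip(actual, predicted)]
    sq_err = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    base_abs = sum(abs(a - mean_actual) for a in actual)
    base_sq = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "MAE": sum(abs_err) / n,
        "RMSE": math.sqrt(sum(sq_err) / n),
        "RAE": sum(abs_err) / base_abs,
        "RRSE": math.sqrt(sum(sq_err) / base_sq),
    }


def recall_and_accuracy(actual, predicted, positive=1):
    """Return recall for the `positive` (defective) class and overall accuracy."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    correct = sum(1 for a, p in pairs if a == p)
    return {"recall": tp / (tp + fn), "accuracy": correct / len(pairs)}


# Hypothetical toy labels: 1 = defective module, 0 = non-defective module.
actual = [1, 0, 0, 1]
predicted = [1, 0, 1, 0]
errors = error_measures(actual, predicted)
scores = recall_and_accuracy(actual, predicted)
```

RAE and RRSE are often reported as percentages, in which case the ratios above would be multiplied by 100. A value below 1 (or 100%) indicates that the model outperforms the mean-predicting baseline.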
Pages: 16
Related papers
50 in total
  • [1] Empirical assessment of machine learning based software defect prediction techniques
    Challagulla, VUB
    Bastani, FB
    Yen, IL
    Paul, RA
    [J]. WORDS 2005: 10TH IEEE INTERNATIONAL WORKSHOP ON OBJECT-ORIENTED REAL-TIME DEPENDABLE, PROCEEDINGS, 2005 : 263 - 270
  • [2] Empirical assessment of machine learning based software defect prediction techniques
    Challagulla, Venkata Udaya B.
    Bastani, Farokh B.
    Yen, I-Ling
    Paul, Raymond A.
    [J]. INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2008, 17 (02) : 389 - 400
  • [3] An empirical framework for defect prediction using machine learning techniques with Android software
    Malhotra, Ruchika
    [J]. APPLIED SOFT COMPUTING, 2016, 49 : 1034 - 1050
  • [4] Performance evaluation of software defect prediction with NASA dataset using machine learning techniques
    Siddiqui T.
    Mustaqeem M.
    [J]. International Journal of Information Technology, 2023, 15 (8) : 4131 - 4139
  • [5] Software Defect Prediction Analysis Using Machine Learning Techniques
    Khalid, Aimen
    Badshah, Gran
    Ayub, Nasir
    Shiraz, Muhammad
    Ghouse, Mohamed
    [J]. SUSTAINABILITY, 2023, 15 (06)
  • [6] Software Defect Prediction on Unlabelled Dataset with Machine Learning Techniques
    Ronchieri, Elisabetta
    Canaparo, Marco
    Belgiovine, Mauro
    Salomoni, Davide
    [J]. 2019 IEEE NUCLEAR SCIENCE SYMPOSIUM AND MEDICAL IMAGING CONFERENCE (NSS/MIC), 2019
  • [7] A study on software metrics based software defect prediction using data mining and machine learning techniques
    Prasad, Manjula C.M.
    Florence, Lilly
    Arya, Arti
    [J]. International Journal of Database Theory and Application, 2015, 8 (03) : 179 - 190
  • [8] Class Imbalance Issue in Software Defect Prediction Models by various Machine Learning Techniques: An Empirical Study
    Pandey, Sushant Kumar
    Tripathi, Anil Kumar
    [J]. 2021 8TH INTERNATIONAL CONFERENCE ON SMART COMPUTING AND COMMUNICATIONS (ICSCC), 2021 : 58 - 63
  • [9] An Empirical Evaluation of Machine Learning Techniques for Crop Prediction
    Mariammal, G.
    Suruliandi, A.
    Raja, S. P.
    Poongothai, E.
    [J]. INTERNATIONAL JOURNAL OF INTERACTIVE MULTIMEDIA AND ARTIFICIAL INTELLIGENCE, 2023, 8 (04) : 96 - 104
  • [10] Empirical Assessment of Machine Learning Techniques for Software Requirements Risk Prediction
    Naseem, Rashid
    Shaukat, Zain
    Irfan, Muhammad
    Shah, Muhammad Arif
    Ahmad, Arshad
    Muhammad, Fazal
    Glowacz, Adam
    Dunai, Larisa
    Antonino-Daviu, Jose
    Sulaiman, Adel
    [J]. ELECTRONICS, 2021, 10 (02) : 1 - 19