Comparative analysis of impact of classification algorithms on security and performance bug reports

Authors
Said, Maryyam [2]
Bin Faiz, Rizwan [2]
Aljaidi, Mohammad [1]
Alshammari, Muteb [3]
Affiliations
[1] Zarqa Univ, Fac Informat Technol, Dept Comp Sci, Zarqa 13116, Jordan
[2] Riphah Int Univ, Fac Comp, Islamabad 46000, Pakistan
[3] Northern Border Univ, Fac Comp & Informat Technol, Dept Informat Technol, Rafha 91431, Saudi Arabia
Keywords
bug classification; security bug; performance bug; text mining; bug prediction
DOI
10.1515/jisys-2024-0045
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Identification and classification of bugs, e.g., security and performance bugs, is a preemptive and fundamental practice that contributes to the development of secure and efficient software. Software Quality Assurance (SQA) needs to classify bugs into relevant categories, e.g., security and performance bugs, since one type of bug may take priority over another; doing so facilitates software evolution and maintenance. In addition to classification, it would be ideal for the SQA manager to prioritize security and performance bugs by their level of persistence, severity, or impact, and to assign them to developers whose expertise matches such bugs, thus facilitating triaging. The aim of this research is to compare and analyze the prediction accuracy of machine learning algorithms, i.e., Artificial neural network (ANN), Support vector machine (SVM), Naïve Bayes (NB), Decision tree (DT), Logistic regression (LR), and K-nearest neighbor (KNN), in identifying security and performance bugs from a bug repository. We first label an existing dataset from the Bugzilla repository with the help of a software security expert in order to train the algorithms. Our research type is explanatory, and our research method is controlled experimentation, in which the independent variables are the classification algorithms (ANN, SVM, NB, DT, LR, and KNN) and the dependent variable is prediction accuracy. We first preprocessed the bug reports, then extracted features using Term Frequency-Inverse Document Frequency (TF-IDF), and finally applied the classification algorithms. The results were measured through accuracy, precision, recall, and F-measure, and were then compared and validated using ten-fold cross-validation. Comparative analysis reveals that two algorithms (SVM and LR) perform better in terms of precision (0.99) for performance bugs, and three algorithms (SVM, ANN, and LR) perform better in terms of F1 score for security bugs, than the other classification algorithms. This is essentially due to the linear nature of the dataset and its large number of features.
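The evaluation step described in the abstract (accuracy, precision, recall, F-measure, and ten-fold cross-validation) can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names precision_recall_f1, accuracy, and k_fold_indices and the toy labels are assumptions introduced here, and the round-robin fold assignment is a simplification that may differ from the splitting scheme used in the study.

```python
from typing import List, Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for one class (e.g. 1 = security bug)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of reports whose predicted label matches the expert label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs.

    Samples are assigned to folds round-robin (an assumption for this sketch);
    each fold serves as the test set exactly once.
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train_idx, test_idx))
    return splits


if __name__ == "__main__":
    # Toy labels only: 1 = security bug, 0 = other. Not data from the study.
    y_true = [1, 1, 1, 0, 0]
    y_pred = [1, 1, 0, 1, 0]
    print(precision_recall_f1(y_true, y_pred), accuracy(y_true, y_pred))
    print(len(k_fold_indices(25, k=10)))
```

In a full experiment, each of the ten splits would train a classifier on the TF-IDF vectors at the training indices and score it on the held-out indices, and the per-fold metrics would then be averaged.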
Pages: 23