Influence Analysis Method of Class Imbalance on Software Defect Prediction Model Stability and Prediction Performance

Cited by: 0
Authors
Zhang, Yan-Mei [1, 2]
Zhi, Sheng-Lin [3]
Jiang, Shu-Juan [1, 2]
Yuan, Guan [1, 2]
Affiliations
[1] Mine Digitization Engineering Research Center, The Ministry of Education, China University of Mining and Technology, Jiangsu, Xuzhou, 221116, China
[2] School of Computer Science and Technology, China University of Mining and Technology, Jiangsu, Xuzhou, 221116, China
[3] KeHua Data Co., Ltd, Guangdong, Shenzhen, 518055, China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
Classification (of information); Defects; Forecasting
DOI
10.12263/DZXB.20210911
Abstract
This paper proposes a method for analyzing how class imbalance affects the stability and prediction performance of software defect prediction models. First, undersampling is used to construct, from the original data set, a group of new data sets whose imbalance ratios are lower than that of the original data set. Fixed random seeds are used during construction so that data sets built from the same original data set at the same imbalance ratio contain identical data, which reduces run-to-run randomness in the results. Second, the Matthews correlation coefficient (MCC) is taken as the performance evaluation indicator of the prediction model: each newly generated data set is fed to the model's classification algorithm for training and prediction evaluation, yielding the MCC value of the current data set at each imbalance ratio. A performance stability evaluation indicator is also proposed. The experimental results show that, compared with AUC, MCC is better suited as the stability evaluation indicator for software defect prediction models under class imbalance. In terms of the stability of defect prediction performance, cost-sensitive models perform better than ensemble models. © 2023 Chinese Institute of Electronics. All rights reserved.
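The data construction and scoring steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `undersample` and `mcc`, the toy labels, the seed value, and the choice of random majority-class undersampling are all assumptions made for illustration. The classifier itself is omitted, and the proposed stability indicator is not reproduced because the abstract does not define it.

```python
import math
import random


def undersample(labels, target_ratio, seed=0):
    """Hypothetical helper: keep every defective sample (label 1) and a
    seeded random subset of non-defective samples (label 0, assumed to be
    the majority) so that majority/minority is about target_ratio."""
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    keep = min(len(majority), round(target_ratio * len(minority)))
    rng = random.Random(seed)  # fixed seed: the same subset on every run
    return sorted(minority + rng.sample(majority, keep))


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels in {0, 1}.
    Returns 0.0 when the denominator is zero (a common convention)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Toy data (assumed): 10 defective and 90 non-defective modules, ratio 9.
labels = [1] * 10 + [0] * 90

# Build one data set per target imbalance ratio below the original ratio.
for ratio in (1, 3, 5):
    idx = undersample(labels, ratio, seed=42)
    subset = [labels[i] for i in idx]
    print(ratio, len(subset), sum(subset))
```

In a full experiment, each subset would be used to train and evaluate a classifier, and `mcc` would be applied to its predictions to obtain one MCC value per imbalance ratio.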
Pages: 2076-2087