Using Class Imbalance Learning for Software Defect Prediction

被引:376
|
作者
Wang, Shuo [1 ]
Yao, Xin [1 ]
机构
[1] Univ Birmingham, Sch Comp Sci, CERCIA, Birmingham B15 2TT, W Midlands, England
基金
英国工程与自然科学研究理事会;
关键词
Class imbalance learning; ensemble learning; negative correlation learning; software defect prediction; STATIC CODE ATTRIBUTES; NEURAL-NETWORKS; MACHINE;
D O I
10.1109/TR.2013.2259203
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
To facilitate software testing, and save testing costs, a wide range of machine learning methods have been studied to predict defects in software modules. Unfortunately, the imbalanced nature of this type of data increases the learning difficulty of such a task. Class imbalance learning specializes in tackling classification problems with imbalanced distributions, which could be helpful for defect prediction, but has not been investigated in depth so far. In this paper, we study the issue of if and how class imbalance learning methods can benefit software defect prediction with the aim of finding better solutions. We investigate different types of class imbalance learning methods, including resampling techniques, threshold moving, and ensemble algorithms. Among those methods we studied, AdaBoost.NC shows the best overall performance in terms of the measures including balance, G-mean, and Area Under the Curve (AUC). To further improve the performance of the algorithm, and facilitate its use in software defect prediction, we propose a dynamic version of AdaBoost. NC, which adjusts its parameter automatically during training. Without the need to pre-define any parameters, it is shown to be more effective and efficient than the original AdaBoost. NC.
引用
收藏
页码:434 / 443
页数:10
相关论文
共 50 条
  • [1] Class Imbalance Learning to Heterogeneous Cross-Software Projects Defect Prediction
    Vashisht, Rohit
    Rizvi, Syed Afzal Murtaza
    [J]. INTERNATIONAL JOURNAL OF SOFTWARE INNOVATION, 2022, 10 (01)
  • [2] Prediction of Defective Software Modules Using Class Imbalance Learning
    Tomar, Divya
    Agarwal, Sonali
    [J]. APPLIED COMPUTATIONAL INTELLIGENCE AND SOFT COMPUTING, 2016, 2016
  • [3] Class Imbalance Data-Generation for Software Defect Prediction
    Li, Zheng
    Zhang, Xingyao
    Guo, Junxia
    Shang, Ying
    [J]. 2019 26TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE (APSEC), 2019, : 276 - 283
  • [4] Tackling class overlap and imbalance problems in software defect prediction
    Lin Chen
    Bin Fang
    Zhaowei Shang
    Yuanyan Tang
    [J]. Software Quality Journal, 2018, 26 : 97 - 125
  • [5] Class Imbalance Reduction (CIR): A Novel Approach to Software Defect Prediction in the Presence of Class Imbalance
    Bejjanki, Kiran Kumar
    Gyani, Jayadev
    Gugulothu, Narsimha
    [J]. SYMMETRY-BASEL, 2020, 12 (03):
  • [6] Tackling class overlap and imbalance problems in software defect prediction
    Chen, Lin
    Fang, Bin
    Shang, Zhaowei
    Tang, Yuanyan
    [J]. SOFTWARE QUALITY JOURNAL, 2018, 26 (01) : 97 - 125
  • [7] SOFTWARE DEFECT PREDICTION: ANALYSIS OF CLASS IMBALANCE AND PERFORMANCE STABILITY
    Balogun, Abdullateef O.
    Basri, Shuib
    Abdulkadir, Said J.
    Adeyemo, Victor E.
    Imam, Abdullahi A.
    Bajeh, Amos O.
    [J]. JOURNAL OF ENGINEERING SCIENCE AND TECHNOLOGY, 2019, 14 (06): : 3294 - 3308
  • [8] An Ensemble Oversampling Model for Class Imbalance Problem in Software Defect Prediction
    Huda, Shamsul
    Liu, Kevin
    Abdelrazek, Mohamed
    Ibrahim, Amani
    Alyahya, Sultan
    Al-Dossari, Hmood
    Ahmad, Shafiq
    [J]. IEEE ACCESS, 2018, 6 : 24184 - 24195
  • [9] A Survey of Different Approaches for the Class Imbalance Problem in Software Defect Prediction
    Dar, Abdul Waheed
    Farooq, Sheikh Umar
    [J]. INTERNATIONAL JOURNAL OF SOFTWARE SCIENCE AND COMPUTATIONAL INTELLIGENCE-IJSSCI, 2022, 14 (01):
  • [10] Class Imbalance Issue in Software Defect Prediction Models by various Machine Learning Techniques: An Empirical Study
    Pandey, Sushant Kumar
    Tripathi, Anil Kumar
    [J]. 2021 8TH INTERNATIONAL CONFERENCE ON SMART COMPUTING AND COMMUNICATIONS (ICSCC), 2021, : 58 - 63