Class Imbalance Data-Generation for Software Defect Prediction

Cited by: 3
Authors
Li, Zheng [1]
Zhang, Xingyao [1]
Guo, Junxia [1]
Shang, Ying [1]
Affiliations
[1] Beijing Univ Chem Technol, Dept Comp Sci, Beijing 100029, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
software defect prediction; imbalanced data; data generation; machine learning;
DOI
10.1109/APSEC48747.2019.00045
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
The imbalanced nature of classes in software defect data, which includes both intra-class and inter-class imbalance, increases the difficulty of learning an effective defect prediction model. Most sampling and example-generation approaches focus only on inter-class imbalance in defect data and do not handle intra-class imbalance effectively. This paper proposes a distribution-based data generation approach for software defect prediction that deals with inter-class and intra-class imbalance simultaneously. First, the classified sub-regions are clustered according to the distribution in the sample feature space. Second, data are generated by strategies matched to the distribution in each sub-region: inter-class balance is achieved by increasing the number of defective samples, and intra-class balance is achieved by generating data at different densities in different sub-regions. Experimental results show that the proposed method can reduce the impact of data imbalance on defect prediction and effectively improve the accuracy of the software defect prediction model by generating inter-class and intra-class balanced defect data.
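The two steps in the abstract can be illustrated with a minimal sketch. This record gives only the abstract, so the code below is not the authors' algorithm: the use of k-means for finding sub-regions, the inverse-size weighting, the linear interpolation between same-cluster samples, and all function names (`kmeans`, `generate_defective`) are assumptions chosen for illustration.

```python
import random


def dist2(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20):
    """Tiny k-means with farthest-point initialisation (an assumption,
    standing in for whatever clustering the paper actually uses)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(dist2(p, c) for c in centers))
        )
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            d = [dist2(p, c) for c in centers]
            clusters[d.index(min(d))].append(p)
        # Recompute each center; keep the old one if a cluster is empty.
        centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centers)
        ]
    return [cl for cl in clusters if cl]


def generate_defective(defective, n_clean, k=2, seed=0):
    """Return synthetic defective samples.

    Inter-class balance: generate enough samples to match n_clean.
    Intra-class balance: sparse sub-regions receive a larger share
    (weight = 1 / cluster size, a hypothetical strategy).
    """
    rng = random.Random(seed)
    clusters = kmeans(defective, k)
    need = max(0, n_clean - len(defective))
    weights = [1.0 / len(cl) for cl in clusters]
    total = sum(weights)
    quotas = [int(need * w / total) for w in weights]
    # Give any rounding remainder to the sparsest sub-region.
    quotas[weights.index(max(weights))] += need - sum(quotas)
    synthetic = []
    for cl, quota in zip(clusters, quotas):
        for _ in range(quota):
            a, b = rng.choice(cl), rng.choice(cl)
            t = rng.random()
            # Interpolate between two real samples of the same sub-region.
            synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return synthetic
```

Calling `generate_defective(defective, n_clean)` with a list of feature tuples for the defective class returns only the new samples, so the caller decides how to merge them with the original data. Interpolating within a sub-region keeps each synthetic point inside the range spanned by that sub-region's real samples.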
Pages: 276 - 283
Page count: 8
Related papers
50 in total
  • [1] Using Class Imbalance Learning for Software Defect Prediction
    Wang, Shuo
    Yao, Xin
    [J]. IEEE TRANSACTIONS ON RELIABILITY, 2013, 62 (02) : 434 - 443
  • [2] Tackling class overlap and imbalance problems in software defect prediction
    Lin Chen
    Bin Fang
    Zhaowei Shang
    Yuanyan Tang
    [J]. Software Quality Journal, 2018, 26 : 97 - 125
  • [3] Class Imbalance Reduction (CIR): A Novel Approach to Software Defect Prediction in the Presence of Class Imbalance
    Bejjanki, Kiran Kumar
    Gyani, Jayadev
    Gugulothu, Narsimha
    [J]. SYMMETRY-BASEL, 2020, 12 (03):
  • [4] Tackling class overlap and imbalance problems in software defect prediction
    Chen, Lin
    Fang, Bin
    Shang, Zhaowei
    Tang, Yuanyan
    [J]. SOFTWARE QUALITY JOURNAL, 2018, 26 (01) : 97 - 125
  • [5] SOFTWARE DEFECT PREDICTION: ANALYSIS OF CLASS IMBALANCE AND PERFORMANCE STABILITY
    Balogun, Abdullateef O.
    Basri, Shuib
    Abdulkadir, Said J.
    Adeyemo, Victor E.
    Imam, Abdullahi A.
    Bajeh, Amos O.
    [J]. JOURNAL OF ENGINEERING SCIENCE AND TECHNOLOGY, 2019, 14 (06) : 3294 - 3308
  • [6] An Empirical Study on Data Sampling Methods in Addressing Class Imbalance Problem in Software Defect Prediction
    Odejide, Babajide J.
    Bajeh, Amos O.
    Balogun, Abdullateef O.
    Alanamu, Zubair O.
    Adewole, Kayode S.
    Akintola, Abimbola G.
    Salihu, Shakirat A.
    Usman-Hamza, Fatima E.
    Mojeed, Hammed A.
    [J]. SOFTWARE ENGINEERING PERSPECTIVES IN SYSTEMS, VOL. 1, 2022, 501 : 594 - 610
  • [7] Class Imbalance in Software Fault Prediction Data Set
    Arun, C.
    Lakshmi, C.
    [J]. ARTIFICIAL INTELLIGENCE AND EVOLUTIONARY COMPUTATIONS IN ENGINEERING SYSTEMS, 2020, 1056 : 745 - 757
  • [8] An Ensemble Oversampling Model for Class Imbalance Problem in Software Defect Prediction
    Huda, Shamsul
    Liu, Kevin
    Abdelrazek, Mohamed
    Ibrahim, Amani
    Alyahya, Sultan
    Al-Dossari, Hmood
    Ahmad, Shafiq
    [J]. IEEE ACCESS, 2018, 6 : 24184 - 24195
  • [9] A Survey of Different Approaches for the Class Imbalance Problem in Software Defect Prediction
    Dar, Abdul Waheed
    Farooq, Sheikh Umar
    [J]. INTERNATIONAL JOURNAL OF SOFTWARE SCIENCE AND COMPUTATIONAL INTELLIGENCE-IJSSCI, 2022, 14 (01):
  • [10] Class Imbalance Learning to Heterogeneous Cross-Software Projects Defect Prediction
    Vashisht, Rohit
    Rizvi, Syed Afzal Murtaza
    [J]. INTERNATIONAL JOURNAL OF SOFTWARE INNOVATION, 2022, 10 (01)