A Hybrid Approach to Coping with High Dimensionality and Class Imbalance for Software Defect Prediction

Cited by: 9
Authors
Gao, Kehan [1]
Khoshgoftaar, Taghi [2]
Napolitano, Amri [2]
Affiliations
[1] Eastern Connecticut State Univ, Willimantic, CT 06226 USA
[2] Florida Atlantic Univ, Boca Raton, FL 33431 USA
Keywords
FEATURE RANKING TECHNIQUES; CLASSIFICATION
DOI
10.1109/ICMLA.2012.145
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
High dimensionality and class imbalance are the two main problems affecting many software defect prediction tasks. In this paper, we propose a new technique, named SelectRUSBoost, which is a form of ensemble learning that incorporates data sampling to alleviate class imbalance and feature selection to resolve high dimensionality. To evaluate the effectiveness of the new technique, we apply it to a group of datasets in the context of software defect prediction. We employ two classification learners and six feature selection techniques. We compare the technique to the approach where feature selection and data sampling are used together, as well as to the case where feature selection is used alone (with no sampling at all). The experimental results demonstrate that the SelectRUSBoost technique improves classification performance more effectively than the other approaches.
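The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: a boosting loop in which each round first undersamples the majority class and then selects features before training a weak learner. Everything concrete here is an assumption made for illustration: the function names (`undersample`, `rank_features`, `train_stump`, `fit`, `predict`), the class-mean-gap filter standing in for the paper's feature ranking techniques, the decision stump standing in for its classification learners, and the plain AdaBoost-style weight update. It should not be read as the authors' implementation.

```python
import math
import random

def undersample(y, rng):
    """Indices of all minority (1) rows plus an equal-size random draw of majority (0) rows."""
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    return pos + rng.sample(neg, min(len(neg), len(pos)))

def rank_features(X, y, idx, w, k):
    """Keep the k features with the largest weighted class-mean gap (an assumed stand-in filter)."""
    def gap(j):
        sums = {0: 0.0, 1: 0.0}
        tots = {0: 0.0, 1: 0.0}
        for i in idx:
            sums[y[i]] += w[i] * X[i][j]
            tots[y[i]] += w[i]
        return abs(sums[1] / tots[1] - sums[0] / tots[0])
    return sorted(range(len(X[0])), key=gap, reverse=True)[:k]

def stump_predict(stump, x):
    """Predict 1 when the chosen feature lies on the positive side of the threshold."""
    j, t, sign = stump
    return 1 if sign * (x[j] - t) > 0 else 0

def train_stump(X, y, idx, w, feats):
    """Pick the (feature, threshold, sign) with the lowest weighted error on the sampled rows."""
    best = None
    for j in feats:
        for t in sorted({X[i][j] for i in idx}):
            for sign in (1, -1):
                err = sum(w[i] for i in idx
                          if stump_predict((j, t, sign), X[i]) != y[i])
                if best is None or err < best[0]:
                    best = (err, (j, t, sign))
    return best[1]

def fit(X, y, rounds=10, k=2, seed=0):
    """Boosting loop: undersample, select features, train a stump, reweight all rows."""
    rng = random.Random(seed)
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        idx = undersample(y, rng)
        feats = rank_features(X, y, idx, w, k)
        stump = train_stump(X, y, idx, w, feats)
        # Weighted error on the full training set, clamped to keep the log finite.
        err = sum(w[i] for i in range(n) if stump_predict(stump, X[i]) != y[i])
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified rows, lower the rest, then renormalize.
        w = [w[i] * math.exp(alpha if stump_predict(stump, X[i]) != y[i] else -alpha)
             for i in range(n)]
        total = sum(w)
        w = [v / total for v in w]
    return ensemble

def predict(ensemble, x):
    """Weighted vote of all stumps; a positive score means the defect-prone class (1)."""
    score = sum(a * (1 if stump_predict(s, x) == 1 else -1) for a, s in ensemble)
    return 1 if score > 0 else 0
```

In this sketch, `X` is a list of numeric feature rows and `y` a list of 0/1 labels with 1 as the minority (defect-prone) class; `fit(X, y, rounds=10, k=2)` returns the ensemble and `predict(ensemble, x)` classifies a single row. Repeating sampling and feature selection inside every round is what separates this design from running them once as preprocessing, which is one of the comparison approaches the abstract mentions.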
Pages: 281-288
Number of pages: 8
Related papers
50 in total
  • [1] Class Imbalance Reduction (CIR): A Novel Approach to Software Defect Prediction in the Presence of Class Imbalance
    Bejjanki, Kiran Kumar
    Gyani, Jayadev
    Gugulothu, Narsimha
    [J]. SYMMETRY-BASEL, 2020, 12 (03):
  • [2] A Software Defect Prediction Approach Based on Hybrid Feature Dimensionality Reduction
    Zhang, Shenggang
    Jiang, Shujuan
    Yan, Yue
    [J]. Scientific Programming, 2023, 2023
  • [3] Using Class Imbalance Learning for Software Defect Prediction
    Wang, Shuo
    Yao, Xin
    [J]. IEEE TRANSACTIONS ON RELIABILITY, 2013, 62 (02) : 434 - 443
  • [4] Class Imbalance Data-Generation for Software Defect Prediction
    Li, Zheng
    Zhang, Xingyao
    Guo, Junxia
    Shang, Ying
    [J]. 2019 26TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE (APSEC), 2019 : 276 - 283
  • [6] Tackling class overlap and imbalance problems in software defect prediction
    Chen, Lin
    Fang, Bin
    Shang, Zhaowei
    Tang, Yuanyan
    [J]. SOFTWARE QUALITY JOURNAL, 2018, 26 (01) : 97 - 125
  • [7] SOFTWARE DEFECT PREDICTION: ANALYSIS OF CLASS IMBALANCE AND PERFORMANCE STABILITY
    Balogun, Abdullateef O.
    Basri, Shuib
    Abdulkadir, Said J.
    Adeyemo, Victor E.
    Imam, Abdullahi A.
    Bajeh, Amos O.
    [J]. JOURNAL OF ENGINEERING SCIENCE AND TECHNOLOGY, 2019, 14 (06) : 3294 - 3308
  • [8] MAHAKIL: Diversity Based Oversampling Approach to Alleviate the Class Imbalance Issue in Software Defect Prediction
    Bennin, Kwabena Ebo
    Keung, Jacky
    Phannachitta, Passakorn
    Monden, Akito
    Mensah, Solomon
    [J]. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2018, 44 (06) : 534 - 550
  • [9] An Ensemble Oversampling Model for Class Imbalance Problem in Software Defect Prediction
    Huda, Shamsul
    Liu, Kevin
    Abdelrazek, Mohamed
    Ibrahim, Amani
    Alyahya, Sultan
    Al-Dossari, Hmood
    Ahmad, Shafiq
    [J]. IEEE ACCESS, 2018, 6 : 24184 - 24195
  • [10] A Survey of Different Approaches for the Class Imbalance Problem in Software Defect Prediction
    Dar, Abdul Waheed
    Farooq, Sheikh Umar
    [J]. INTERNATIONAL JOURNAL OF SOFTWARE SCIENCE AND COMPUTATIONAL INTELLIGENCE-IJSSCI, 2022, 14 (01):