Imbalanced Data Processing Model for Software Defect Prediction

Authors
Lijuan Zhou
Ran Li
Shudong Zhang
Hua Wang
Affiliation
[1] Capital Normal University, College of Information Engineering
Source
Wireless Personal Communications, 2018, 102 (02)
Keywords
Software defect prediction; Class imbalance; Attribute selection; Sampling; Ensemble algorithm
Abstract
Software defect prediction is an active research topic in software engineering because it helps assure quality during software development. Class-imbalanced datasets, however, degrade the overall classification accuracy of defect prediction, and this remains a pressing open problem. To address it, this paper proposes a model named ASRA that combines attribute selection, sampling techniques, and an ensemble algorithm. The model first applies chi-square attribute selection to remove redundant attributes, then balances the dataset with a combined sampling technique that pairs SMOTE over-sampling with under-sampling. The ASRA model is then built with the AdaBoost ensemble algorithm, using the J48 decision tree as its base classifier. The experimental data come from UCI datasets. Comparing precision (P), F-measure, and AUC values across the experiments shows that the model improves the classification performance of software defect prediction.
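The combined sampling step described in the abstract can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the function names (`smote_like`, `undersample`, `rebalance`), the neighbour count `k=3`, the fixed random seed, and the toy data are all assumptions made for illustration. It also simplifies SMOTE by picking base samples at random rather than iterating over every minority sample.

```python
import random

def smote_like(minority, n_new, k=3, rng=None):
    """Create n_new synthetic minority rows by interpolating between a
    randomly chosen row and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to base.
        others = [row for row in minority if row is not base]
        others.sort(key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

def undersample(majority, n_keep, rng=None):
    """Keep a random subset of n_keep majority rows (no replacement)."""
    rng = rng or random.Random(0)
    return rng.sample(majority, n_keep)

def rebalance(minority, majority, target):
    """Grow the minority class and shrink the majority class so that
    each ends up with at most `target` rows."""
    new_rows = smote_like(minority, max(0, target - len(minority)))
    kept = undersample(majority, min(target, len(majority)))
    return minority + new_rows, kept

# Hypothetical toy data: 4 defective modules versus 20 clean ones,
# each described by two numeric attributes.
defective = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
clean = [[float(i), float(i % 7)] for i in range(5, 25)]

balanced_defective, balanced_clean = rebalance(defective, clean, target=10)
print(len(balanced_defective), len(balanced_clean))  # 10 10
```

Because each synthetic row lies on a segment between two real minority rows, the new samples stay inside the region already occupied by the defective class. In a pipeline like the one the abstract describes, attribute selection would run before this step and the ensemble classifier would be trained on the rebalanced output.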
Pages: 937 - 950
Number of pages: 13
Related papers
50 in total
  • [1] Imbalanced Data Processing Model for Software Defect Prediction
    Zhou, Lijuan
    Li, Ran
    Zhang, Shudong
    Wang, Hua
    [J]. WIRELESS PERSONAL COMMUNICATIONS, 2018, 102 (02) : 937 - 950
  • [2] Feature Selection with Imbalanced Data for Software Defect Prediction
    Khoshgoftaar, Taghi M.
    Gao, Kehan
    [J]. EIGHTH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, PROCEEDINGS, 2009, : 235 - +
  • [3] Attribute Selection and Imbalanced Data: Problems in Software Defect Prediction
    Khoshgoftaar, Taghi M.
    Gao, Kehan
    Seliya, Naeem
    [J]. 22ND INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2010), PROCEEDINGS, VOL 1, 2010,
  • [4] Handling Imbalanced Data using Ensemble Learning in Software Defect Prediction
    Malhotra, Ruchika
    Jain, Juhi
    [J]. PROCEEDINGS OF THE CONFLUENCE 2020: 10TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, DATA SCIENCE & ENGINEERING, 2020, : 300 - 304
  • [5] A New Software Fault Prediction Model in Imbalanced Data
    Wang, Shi-Hai
    He, Ping
    [J]. 2015 INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND INFORMATION SYSTEM (SEIS 2015), 2015, : 245 - 250
  • [6] Efficiency of oversampling methods for enhancing software defect prediction by using imbalanced data
    Benala, Tirimula Rao
    Tantati, Karunya
    [J]. INNOVATIONS IN SYSTEMS AND SOFTWARE ENGINEERING, 2023, 19 (03) : 247 - 263
  • [7] Ensemble MultiBoost Based on RIPPER Classifier for Prediction of Imbalanced Software Defect Data
    He, Haitao
    Zhang, Xu
    Wang, Qian
    Ren, Jiadong
    Liu, Jiaxin
    Zhao, Xiaolin
    Cheng, Yongqiang
    [J]. IEEE ACCESS, 2019, 7 : 110333 - 110343
  • [8] Efficiency of oversampling methods for enhancing software defect prediction by using imbalanced data
    Tirimula Rao Benala
    Karunya Tantati
    [J]. Innovations in Systems and Software Engineering, 2023, 19 : 247 - 263
  • [9] Applying Weighted Particle Swarm Optimization to Imbalanced Data in Software Defect Prediction
    Brezocnik, Lucija
    Podgorelec, Vili
    [J]. NEW TECHNOLOGIES, DEVELOPMENT AND APPLICATION, 2019, 42 : 289 - 296
  • [10] Online Defect Prediction for Imbalanced Data
    Tan, Ming
    Tan, Lin
    Dara, Sashank
    Mayeux, Caleb
    [J]. 2015 IEEE/ACM 37TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, VOL 2, 2015, : 99 - 108