Using Coding-Based Ensemble Learning to Improve Software Defect Prediction

Cited by: 127
Authors
Sun, Zhongbin [1]
Song, Qinbao [1]
Zhu, Xiaoyan [1]
Affiliations
[1] Xi An Jiao Tong Univ, Dept Comp Sci & Technol, Xian 710049, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Class-imbalance data; meta learning; multiclassifier; software defect prediction; STATIC CODE ATTRIBUTES; NEURAL-NETWORKS; CLASSIFICATION; TREES;
DOI
10.1109/TSMCC.2012.2226152
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Using classification methods to predict software defect proneness with static code attributes has attracted a great deal of attention. The class-imbalance characteristic of software defect data makes prediction much more difficult; thus, a number of methods have been employed to address this problem. However, these conventional methods, such as sampling, cost-sensitive learning, Bagging, and Boosting, could suffer from the loss of important information, unexpected mistakes, and overfitting because they alter the original data distribution. This paper presents a novel method that first converts the imbalanced binary-class data into balanced multiclass data and then builds a defect predictor on the multiclass data with a specific coding scheme. A thorough experiment with four different types of classification algorithms, three data coding schemes, and six conventional imbalance data-handling methods was conducted over the 14 NASA datasets. The experimental results show that the proposed method with a one-against-one coding scheme is, on average, superior to the conventional methods.
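
The idea in the abstract (turn imbalanced binary data into balanced multiclass data, then predict with a one-against-one coding scheme) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The random split of the majority class, the nearest-centroid base learner, the majority-vote decoding, the toy data, and all function names (`split_majority`, `fit_one_vs_one`, `predict`) are assumptions made for illustration only.

```python
import random
from itertools import combinations

def split_majority(y, majority=0, minority=1, seed=0):
    """Relabel majority rows into pseudo-classes of roughly minority size.
    Assumption: a random split; the abstract does not say how data are divided."""
    rng = random.Random(seed)
    maj = [i for i, label in enumerate(y) if label == majority]
    n_min = sum(1 for label in y if label == minority)
    k = max(1, round(len(maj) / max(1, n_min)))  # number of pseudo-classes
    rng.shuffle(maj)
    new_y = list(y)
    for pos, i in enumerate(maj):
        new_y[i] = f"maj_{pos % k}"
    return new_y, k

def centroid(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def fit_one_vs_one(X, labels):
    """Build one binary model per pair of classes (one-against-one coding).
    Assumption: a nearest-centroid rule stands in for the base classifier."""
    by_class = {}
    for row, label in zip(X, labels):
        by_class.setdefault(label, []).append(row)
    cents = {c: centroid(rows) for c, rows in by_class.items()}
    pairs = combinations(sorted(cents, key=str), 2)
    return [(a, b, cents[a], cents[b]) for a, b in pairs]

def predict(models, row, majority=0, minority=1):
    """Vote over all pairwise models, then map pseudo-classes back to binary."""
    votes = {}
    for a, b, ca, cb in models:
        winner = a if sq_dist(row, ca) <= sq_dist(row, cb) else b
        votes[winner] = votes.get(winner, 0) + 1
    best = max(votes, key=votes.get)
    return minority if best == minority else majority

# Toy data (hypothetical): 6 non-defective rows (0) and 2 defective rows (1).
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0, 2], [2, 0], [5, 5], [6, 5]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

new_y, k = split_majority(y)
models = fit_one_vs_one(X, new_y)
print(k, predict(models, [5, 5]), predict(models, [0, 0]))
```

The original rows are never resampled or reweighted here; only the labels of the majority class change, which matches the abstract's point that the conventional methods alter the original data distribution.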
Pages: 1806-1817
Number of pages: 12
Related papers
50 items in total
  • [21] Software defect prediction using stacked denoising autoencoders and two-stage ensemble learning
    Tong, Haonan
    Liu, Bin
    Wang, Shihai
    [J]. INFORMATION AND SOFTWARE TECHNOLOGY, 2018, 96 : 94 - 111
  • [22] Software Defect Prediction Based on Fourier Learning
    Yang, Kang
    Yu, Huiqun
    Fan, Guisheng
    Yang, Xingguang
    Zheng, Song
    Leng, Chunxia
    [J]. PROCEEDINGS OF THE 2018 IEEE INTERNATIONAL CONFERENCE ON PROGRESS IN INFORMATICS AND COMPUTING (PIC), 2018, : 388 - 392
  • [23] Deep learning based software defect prediction
    Qiao, Lei
    Li, Xuesong
    Umer, Qasim
    Guo, Ping
    [J]. NEUROCOMPUTING, 2020, 385 : 100 - 110
  • [24] Software defect prediction ensemble learning algorithm based on adaptive variable sparrow search algorithm
    Tang, Yu
    Dai, Qi
    Yang, Mengyuan
    Du, Tony
    Chen, Lifang
    [J]. INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2023, 14 (06) : 1967 - 1987
  • [25] Software defect prediction ensemble learning algorithm based on adaptive variable sparrow search algorithm
    Yu Tang
    Qi Dai
    Mengyuan Yang
    Tony Du
    Lifang Chen
    [J]. International Journal of Machine Learning and Cybernetics, 2023, 14 : 1967 - 1987
  • [26] Dictionary Learning Based Software Defect Prediction
    Jing, Xiao-Yuan
    Ying, Shi
    Zhang, Zhi-Wu
    Wu, Shan-Shan
    Liu, Jin
    [J]. 36TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2014), 2014, : 414 - 423
  • [27] Heterogeneous Defect Prediction Using Ensemble Learning Technique
    Ansari, Arsalan Ahmed
    Iqbal, Amaan
    Sahoo, Bibhudatta
    [J]. ARTIFICIAL INTELLIGENCE AND EVOLUTIONARY COMPUTATIONS IN ENGINEERING SYSTEMS, 2020, 1056 : 283 - 293
  • [28] Bootstrap aggregation ensemble learning-based reliable approach for software defect prediction by using characterized code feature
    Suresh Kumar, P.
    Behera, H. S.
    Nayak, Janmenjoy
    Naik, Bighnaraj
    [J]. INNOVATIONS IN SYSTEMS AND SOFTWARE ENGINEERING, 2021, 17 (04) : 355 - 379
  • [29] Bootstrap aggregation ensemble learning-based reliable approach for software defect prediction by using characterized code feature
    P. Suresh Kumar
    H. S. Behera
    Janmenjoy Nayak
    Bighnaraj Naik
    [J]. Innovations in Systems and Software Engineering, 2021, 17 : 355 - 379
  • [30] Sparse Coding-based Intra Prediction in VVC
    Schneider, Jens
    Mehlem, Dominik
    Meyer, Maria
    Rohlfing, Christian
    [J]. 2021 PICTURE CODING SYMPOSIUM (PCS), 2021, : 206 - 210