Using Coding-Based Ensemble Learning to Improve Software Defect Prediction

Cited by: 127
Authors
Sun, Zhongbin [1]
Song, Qinbao [1]
Zhu, Xiaoyan [1]
Affiliations
[1] Xi An Jiao Tong Univ, Dept Comp Sci & Technol, Xian 710049, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Class-imbalance data; meta learning; multiclassifier; software defect prediction; static code attributes; neural networks; classification; trees
DOI
10.1109/TSMCC.2012.2226152
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Using classification methods to predict software defect proneness with static code attributes has attracted a great deal of attention. The class-imbalance characteristic of software defect data makes prediction much more difficult; thus, a number of methods have been employed to address this problem. However, these conventional methods, such as sampling, cost-sensitive learning, Bagging, and Boosting, could suffer from the loss of important information, unexpected mistakes, and overfitting because they alter the original data distribution. This paper presents a novel method that first converts the imbalanced binary-class data into balanced multiclass data and then builds a defect predictor on the multiclass data with a specific coding scheme. A thorough experiment with four different types of classification algorithms, three data coding schemes, and six conventional imbalance data-handling methods was conducted over the 14 NASA datasets. The experimental results show that the proposed method with a one-against-one coding scheme is, on average, superior to the conventional methods.
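The two steps named in the abstract (convert imbalanced binary-class data into balanced multiclass data, then build a predictor with a one-against-one coding scheme) can be illustrated with a minimal sketch. This is not the authors' implementation: the random partitioning of the majority class, the nearest-centroid base learner, the majority-vote decoding, the toy data, and all function names are assumptions made only for illustration.

```python
import random
from collections import Counter
from itertools import combinations


def split_majority(majority, n_minority, seed=0):
    """Randomly partition majority-class rows into bins of roughly minority size.

    Assumption: random partitioning; the abstract does not say how the split is done.
    """
    rng = random.Random(seed)
    shuffled = majority[:]
    rng.shuffle(shuffled)
    n_bins = max(1, round(len(shuffled) / n_minority))
    return [shuffled[i::n_bins] for i in range(n_bins)]


def centroid(rows):
    """Mean feature vector of a list of rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_one_against_one(classes):
    """Build one pairwise model per pair of classes (one-against-one coding).

    Assumption: each pairwise model is a nearest-centroid rule, a stand-in
    for whatever base classifier would actually be used.
    """
    cents = {label: centroid(rows) for label, rows in classes.items()}
    return [(a, cents[a], b, cents[b]) for a, b in combinations(sorted(classes), 2)]


def predict(models, x):
    """Decode by majority vote, then map the multiclass winner back to binary."""
    votes = Counter()
    for a, cent_a, b, cent_b in models:
        votes[a if sq_dist(x, cent_a) <= sq_dist(x, cent_b) else b] += 1
    winner = votes.most_common(1)[0][0]
    return "defective" if winner == "defective" else "clean"


# Hypothetical toy data: 3 defective modules versus 9 clean ones (two features each).
defective = [[9.0, 9.0], [8.0, 9.5], [9.5, 8.0]]
clean = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.0], [2.5, 2.5], [1.0, 3.0],
         [3.0, 1.0], [2.0, 2.0], [3.0, 3.0], [1.5, 1.5]]

# Step 1: balanced multiclass data (one minority class plus several majority bins).
classes = {"defective": defective}
for i, rows in enumerate(split_majority(clean, len(defective))):
    classes[f"clean_{i}"] = rows

# Step 2: one-against-one predictor over the multiclass labels.
models = fit_one_against_one(classes)
print(predict(models, [9.0, 8.0]), predict(models, [1.0, 1.0]))
```

In this sketch no sample is discarded or duplicated, so the original data distribution is left untouched, which is the property the abstract contrasts with sampling-based methods.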
Pages: 1806-1817
Page count: 12