On the Value of Oversampling for Deep Learning in Software Defect Prediction

Cited by: 27
Authors
Yedida, Rahul [1]
Menzies, Tim [1]
Affiliations
[1] North Carolina State Univ, Dept Comp Sci, Raleigh, NC 27695 USA
Funding
US National Science Foundation;
Keywords
Deep learning; Tuning; Predictive models; Standards; Prediction algorithms; Training; Tools; Defect prediction; oversampling; class imbalance; neural networks; METRICS SUITE;
DOI
10.1109/TSE.2021.3079841
Chinese Library Classification number
TP31 [Computer Software];
Subject classification codes
081202 ; 0835 ;
Abstract
One truism of deep learning is that the automatic feature engineering (seen in the first layers of those networks) excuses data scientists from performing tedious manual feature engineering prior to running DL. For the specific case of deep learning for defect prediction, we show that that truism is false. Specifically, when we pre-process data with a novel oversampling technique called fuzzy sampling, as part of a larger pipeline called GHOST (Goal-oriented Hyper-parameter Optimization for Scalable Training), then we can do significantly better than the prior DL state of the art in 14/20 defect data sets. Our approach yields state-of-the-art results with significantly faster deep learners. These results present a cogent case for the use of oversampling prior to applying deep learning on software defect prediction data sets.
Pages: 3103 - 3116
Number of pages: 14
Related papers
50 in total
  • [1] On the use of deep learning in software defect prediction
    Giray, Gorkem
    Bennin, Kwabena Ebo
    Koksal, Omer
    Babur, Onder
    Tekinerdogan, Bedir
    [J]. JOURNAL OF SYSTEMS AND SOFTWARE, 2023, 195
  • [2] Deep learning based software defect prediction
    Qiao, Lei
    Li, Xuesong
    Umer, Qasim
    Guo, Ping
    [J]. NEUROCOMPUTING, 2020, 385 : 100 - 110
  • [3] Software Defect Prediction using Deep Learning
    Nevendra, Meetesh
    Singh, Pradeep
    [J]. ACTA POLYTECHNICA HUNGARICA, 2021, 18 (10) : 173 - 189
  • [4] Deep Learning for Software Defect Prediction in time
    Yadav, Monika
    Singh, Vijendra
    Rastogi, Priyanka
    [J]. 2018 FIFTH INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND GRID COMPUTING (IEEE PDGC), 2018 : 7 - 12
  • [5] Instance gravity oversampling method for software defect prediction
    Tang, Yu
    Zhou, Yang
    Yang, Cheng
    Du, Ye
    Yang, Ming-song
    [J]. Information and Software Technology, 2025, 179
  • [6] Is deep learning good enough for software defect prediction?
    Pandey, Sushant Kumar
    Haldar, Arya
    Tripathi, Anil Kumar
    [J]. INNOVATIONS IN SYSTEMS AND SOFTWARE ENGINEERING, 2023
  • [7] Performing Software Defect Prediction Using Deep Learning
    Gurung, Saksham
    [J]. Communications in Computer and Information Science, 2022, 1697 CCIS : 319 - 331
  • [8] A Survey on Software Defect Prediction Using Deep Learning
    Akimova, Elena N.
    Bersenev, Alexander Yu
    Deikov, Artem A.
    Kobylkin, Konstantin S.
    Konygin, Anton V.
    Mezentsev, Ilya P.
    Misilov, Vladimir E.
    [J]. MATHEMATICS, 2021, 9 (11)
  • [9] A Survey of Software Defect Prediction Based on Deep Learning
    Nevendra, Meetesh
    Singh, Pradeep
    [J]. ARCHIVES OF COMPUTATIONAL METHODS IN ENGINEERING, 2022, 29 (07) : 5723 - 5748
  • [10] A Survey of Software Defect Prediction Based on Deep Learning
    Meetesh Nevendra
    Pradeep Singh
    [J]. Archives of Computational Methods in Engineering, 2022, 29 : 5723 - 5748