Software Defect Prediction Based on Cost-Sensitive Dictionary Learning

Cited by: 8
Authors
Wan, Hongyan [1]
Wu, Guoqing [1]
Yu, Mali [2]
Yuan, Mengting [1]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Hubei, Peoples R China
[2] Jiujiang Univ, Sch Informat Sci & Technol, Jiujiang 332005, Peoples R China
Keywords
Software defect prediction; dictionary learning; cost-sensitive; bilevel optimization; sparse coding; sparse representations; neural networks; quality
DOI
10.1142/S0218194019500384
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Software defect prediction is widely used to improve the quality of software systems. Most real software defect datasets contain far fewer defective modules than defect-free modules, and such highly class-imbalanced data typically make accurate prediction difficult: a model trained on them tends to misclassify a defective module as defect-free. Because different software modules are similar to one another, a module can be represented by sparse representation coefficients over a pre-defined dictionary built from historical software defect data. In this study, we use dictionary learning to predict software defects. We optimize the classifier parameters and the dictionary atoms iteratively so that the extracted features (the sparse representations) are optimal for the trained classifier. We prove the optimality condition of the elastic net, which is used to solve for the sparse coding coefficients, and the regularity of the elastic net solution. Because misclassifying a defective module generally incurs a much higher cost than misclassifying a defect-free one, we take the different misclassification costs into account and increase the penalty on misclassified defective modules during dictionary learning, which biases the classifier toward labeling a module as defective. We thus propose a cost-sensitive software defect prediction method using dictionary learning (CSDL). Experimental results on 10 class-imbalanced NASA datasets show that our method is more effective than several typical state-of-the-art defect prediction methods.
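The two ingredients named in the abstract, elastic-net sparse coding over a dictionary and a misclassification cost that weighs defective modules more heavily, can be illustrated with a minimal sketch. This is not the authors' CSDL algorithm: the dictionary is held fixed rather than learned jointly with the classifier, the solver is plain coordinate descent, and the function names, the regularization weights `lam1` and `lam2`, and the cost values `cost_fn` and `cost_fp` are all illustrative assumptions.

```python
import math


def soft_threshold(z, t):
    """Soft-thresholding operator, the proximal map of t * |z|."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net_code(x, atoms, lam1=0.1, lam2=0.01, n_iter=200):
    """Sparse-code x over a fixed dictionary by coordinate descent on
        0.5 * ||x - sum_j a_j * atoms[j]||^2 + lam1 * ||a||_1 + 0.5 * lam2 * ||a||^2.
    `atoms` is a list of dictionary atoms, each a list of floats (hypothetical layout).
    """
    a = [0.0] * len(atoms)
    residual = list(x)  # equals x - D a, and a starts at zero
    norms = [sum(v * v for v in d) for d in atoms]
    for _ in range(n_iter):
        for j, d in enumerate(atoms):
            # Correlation of atom j with the residual that excludes atom j itself.
            rho = sum(di * ri for di, ri in zip(d, residual)) + norms[j] * a[j]
            new = soft_threshold(rho, lam1) / (norms[j] + lam2)
            delta = new - a[j]
            if delta != 0.0:
                residual = [ri - delta * di for ri, di in zip(residual, d)]
                a[j] = new
    return a


def cost_sensitive_log_loss(score, label, cost_fn=5.0, cost_fp=1.0):
    """Logistic loss weighted by misclassification cost.
    label 1 = defective, 0 = defect-free. cost_fn (assumed value) scales the
    loss on defective modules, so missing a defect is penalized more.
    """
    p = 1.0 / (1.0 + math.exp(-score))
    if label == 1:
        return -cost_fn * math.log(p)
    return -cost_fp * math.log(1.0 - p)


# Toy usage: a module described by two metrics, coded over two unit atoms.
code = elastic_net_code([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
loss_defective = cost_sensitive_log_loss(0.0, 1)
loss_clean = cost_sensitive_log_loss(0.0, 0)
```

With an uncertain classifier score of 0, the loss on a defective module is `cost_fn / cost_fp` times the loss on a defect-free one, which is the kind of asymmetry that pushes a trained classifier toward predicting "defective".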
Pages: 1219-1243
Page count: 25