Improve cross-project just-in-time defect prediction with dynamic transfer learning

Cited by: 0
Authors
Dai, Hongming [1, 2]
Xi, Jianqing [1 ]
Dai, Hong-Liang [3 ]
Affiliations
[1] School of Software, South China University of Technology, Guangzhou, 510006, China
[2] School of Information, Guangdong Polytechnic of Science and Trade, Guangzhou, 510430, China
[3] School of Economics and Statistics, Guangzhou University, Guangzhou, 510006, China
Keywords
Prediction models
DOI
10.1016/j.jss.2024.112214
Abstract
Cross-project just-in-time software defect prediction (CP-JIT-SDP) is a prominent research topic in the field of software engineering. This approach is characterized by its immediacy, accuracy, real-time feedback, and traceability, enabling it to effectively address the challenges of defect prediction in new projects or projects with limited training data. However, CP-JIT-SDP faces significant challenges due to the differences in the feature distribution between the source and target projects. To address this issue, researchers have proposed methods for adjusting marginal or conditional probability distributions. This study introduces a transfer-learning approach that integrates dynamic distribution adaptation. The kernel variance matching (KVM) method is proposed to adjust the disparity in the marginal probability distribution by recalculating the variance of the source and target projects within the reproducing kernel Hilbert space (RKHS) to minimize the variance disparity. The categorical boosting (CatBoost) algorithm is used to construct models, while the improved CORrelation ALignment (CORAL) method is applied to develop the loss function to address the difference in the conditional probability distribution. This method is abbreviated as KCC, where the symbol K represents KVM, the symbol C represents CatBoost, and the next symbol C represents improved CORAL. The KCC method aims to optimize the joint probability distribution of the source project so that it closely agrees with that of the target project through iterative and dynamic integration. Six well-known open-source projects were used to evaluate the effectiveness of the proposed method. The empirical findings indicate that the KCC method exhibited significant improvements over the baseline methods. 
In particular, the KCC method demonstrated an average increase of 18% in the geometric mean (G-mean), 105.4% in the Matthews correlation coefficient (MCC), 25.6% in the F1-score, and 16.9% in the area under the receiver operating characteristic curve (AUC) when compared to the baseline methods. Furthermore, the KCC method demonstrated greater stability. © 2024 Elsevier Inc.
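Three of the threshold-based metrics named in the abstract (G-mean, MCC, and F1-score) can be computed directly from a binary confusion matrix. The following is a minimal sketch, assuming the standard textbook definitions (G-mean as the geometric mean of recall and specificity); the function names and the toy labels are hypothetical and are not taken from the paper. AUC is left out because it requires ranked prediction scores rather than hard labels.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = defective, 0 = clean)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def g_mean(tp, tn, fp, fn):
    """Geometric mean of recall (TPR) and specificity (TNR)."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(recall * specificity)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def f1_score(tp, tn, fp, fn):
    """Harmonic mean of precision and recall; tn is unused but kept
    so that all three metrics share one call signature."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: 3 defective and 5 clean commits.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
counts = confusion_counts(y_true, y_pred)
print("G-mean:", round(g_mean(*counts), 4))
print("MCC:", round(mcc(*counts), 4))
print("F1:", round(f1_score(*counts), 4))
```

MCC ranges from -1 to 1 and can be close to zero for weak baselines, which is one reason a relative improvement in MCC can look much larger than the corresponding improvement in G-mean or F1-score.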
Related papers
50 in total
  • [31] Cross-Project Dynamic Defect Prediction Model for Crowdsourced test
    Yao, Yi
    Liu, Yuchan
    Huang, Song
    Chen, Hao
    Liu, Jialuo
    Yang, Fan
    [J]. 2020 IEEE 20TH INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY, AND SECURITY (QRS 2020), 2020, : 223 - 230
  • [32] Heterogeneous cross-project defect prediction with multiple source projects based on transfer learning
    Yin, Xinglong
    Liu, Lei
    Liu, Huaxiao
    Wu, Qi
    [J]. MATHEMATICAL BIOSCIENCES AND ENGINEERING, 2020, 17 (02) : 1020 - 1040
  • [33] DeepCPDP: Deep Learning Based Cross-Project Defect Prediction
    Chen, Deyu
    Chen, Xiang
    Li, Hao
    Xie, Junfeng
    Mu, Yanzhou
    [J]. IEEE ACCESS, 2019, 7 : 184832 - 184848
  • [34] Cross-project Defect Prediction Method Using Adversarial Learning
    Xing Y.
    Qian X.-M.
    Guan Y.
    Zhang S.-H.
    Zhao M.-C.
    Lin W.-T.
    [J]. Ruan Jian Xue Bao/Journal of Software, 2022, 33 (06) : 2097 - 2112
  • [35] Cross-project clone consistent-defect prediction via transfer-learning method
    Jiang, Wenchao
    Qiu, Shaojian
    Liang, Tiancai
    Zhang, Fanlong
    [J]. INFORMATION SCIENCES, 2023, 635 : 138 - 150
  • [36] Cross-project bug type prediction based on transfer learning
    Xiaoting Du
    Zenghui Zhou
    Beibei Yin
    Guanping Xiao
    [J]. Software Quality Journal, 2020, 28 : 39 - 57
  • [37] Understanding the Automated Parameter Optimization on Transfer Learning for Cross-Project Defect Prediction: An Empirical Study
    Li, Ke
    Xiang, Zilin
    Chen, Tao
    Wang, Shuo
    Tan, Kay Chen
    [J]. 2020 ACM/IEEE 42ND INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2020), 2020, : 566 - 577
  • [38] A Framework for Homogeneous Cross-Project Defect Prediction
    Goel, Lipika
    Sharma, Mayank
    Khatri, Sunil Kumar
    Damodaran, D.
    [J]. INTERNATIONAL JOURNAL OF SOFTWARE INNOVATION, 2021, 9 (01) : 52 - 68
  • [39] A Replication Study: Just-In-Time Defect Prediction with Ensemble Learning
    Young, Steven
    Abdou, Tamer
    Bener, Ayse
    [J]. 2018 IEEE/ACM 6TH INTERNATIONAL WORKSHOP ON REALIZING ARTIFICIAL INTELLIGENCE SYNERGIES IN SOFTWARE ENGINEERING (RAISE), 2018, : 42 - 47
  • [40] Assessing the Effect of Imbalanced Learning on Cross-project Software Defect Prediction
    Sohan, Md Fahimuzzman
    Jabiullah, Md Ismail
    Rahman, Sheikh Shah Mohammad Motiur
    Mahmud, S. M. Hasan
    [J]. 2019 10TH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND NETWORKING TECHNOLOGIES (ICCCNT), 2019,