A three-stage transfer learning framework for multi-source cross-project software defect prediction

Cited by: 0
Authors
Bai, Jiaojiao [1]
Jia, Jingdong [1]
Capretz, Luiz Fernando [2]
Affiliations
[1] Beihang Univ, Sch Software, 37 Xueyuan Rd, Beijing 100191, Peoples R China
[2] Western Univ, Elect & Comp Engn, London, ON, Canada
Keywords
Transfer learning; Cross-project defect prediction; Source selection; Multi-source utilization; 3SW-MSTL;
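DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract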
Context: Transfer learning techniques have proven effective in cross-project defect prediction (CPDP). However, some issues remain open. First, the conditional distribution difference between source and target projects has not been considered. Second, when multiple source projects are available, most studies rarely consider source selection or multi-source data utilization; instead, they use all available projects and merge the multi-source data into one final dataset.
Objective: To address these issues, we propose a three-stage weighting framework for multi-source transfer learning (3SW-MSTL) in CPDP. In stage 1, a source selection strategy selects a suitable number of source projects from all available projects. In stage 2, a transfer technique is applied to minimize marginal distribution differences. In stage 3, a multi-source data utilization scheme that uses conditional distribution information guides the use of the multi-source transferred data.
Method: First, we designed five source selection strategies and four multi-source utilization schemes and, by comparing their influence on prediction performance, chose the best of each for stages 1 and 3 of 3SW-MSTL. Second, to validate 3SW-MSTL, we compared it with four multi-source and six single-source CPDP methods, a baseline within-project defect prediction (WPDP) method, and two unsupervised methods on data from 30 widely used open-source projects.
Results: In the experiments, bellwether was chosen as the source selection strategy and weighted vote as the multi-source utilization scheme for 3SW-MSTL. Our results indicate that 3SW-MSTL outperforms the four multi-source CPDP methods, the six single-source CPDP methods, and the two unsupervised methods, and that it is comparable to the WPDP method.
Conclusion: The proposed 3SW-MSTL model is more effective because it considers the two issues mentioned above.
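The abstract names weighted vote as the multi-source utilization scheme but does not say how the weights are derived or how the votes are combined. The following is only a minimal sketch of a generic weighted vote, under the assumption that each selected source project yields one model that outputs a defect probability per target module and that each source already has a non-negative weight. The function name `weighted_vote`, the `threshold` parameter, and the example numbers are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def weighted_vote(
    probabilities: Sequence[Sequence[float]],
    weights: Sequence[float],
    threshold: float = 0.5,
) -> List[int]:
    """Combine per-source defect probabilities into one label per module.

    probabilities[i][j] is the defect probability that the model trained on
    source project i assigns to target module j. weights[i] is a hypothetical
    non-negative score for source i; how it is computed is an assumption
    left outside this sketch.
    """
    if len(probabilities) != len(weights):
        raise ValueError("need exactly one weight per source model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_modules = len(probabilities[0])
    labels = []
    for j in range(n_modules):
        # Weighted average of the source models' probabilities for module j.
        score = sum(w * p[j] for w, p in zip(weights, probabilities)) / total
        labels.append(1 if score >= threshold else 0)
    return labels


# Illustrative values only: three source models, three target modules.
source_probabilities = [
    [0.9, 0.2, 0.6],
    [0.4, 0.1, 0.7],
    [0.2, 0.8, 0.3],
]
source_weights = [0.5, 0.3, 0.2]
print(weighted_vote(source_probabilities, source_weights))  # [1, 0, 1]
```

In this sketch a heavily weighted source dominates the decision: the second module is labeled non-defective even though the third source rates it at 0.8, because that source carries the smallest weight.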
Pages: 16
Related papers
50 in total
  • [31] Multi-source Cross Project Defect Prediction with Joint Wasserstein Distance and Ensemble Learning
    Zou, Quanyi
    Lu, Lu
    Yang, Zhanyu
    Xu, Hao
    [J]. 2021 IEEE 32ND INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING (ISSRE 2021), 2021, : 57 - 68
  • [32] Heterogeneous Cross-Project Defect Prediction Using Encoder Networks and Transfer Learning
    Haque, Radowanul
    Ali, Aftab
    McClean, Sally
    Cleland, Ian
    Noppen, Joost
    [J]. IEEE ACCESS, 2024, 12 : 409 - 419
  • [33] Cross-Project Transfer Learning on Lightweight Code Semantic Graphs for Defect Prediction
    Fang, Dingbang
    Liu, Shaoying
    Li, Yang
    [J]. INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING, 2023, 33 (07) : 1095 - 1117
  • [34] A two-phase transfer learning model for cross-project defect prediction
    Liu, Chao
    Yang, Dan
    Xia, Xin
    Yan, Meng
    Zhang, Xiaohong
    [J]. INFORMATION AND SOFTWARE TECHNOLOGY, 2019, 107 : 125 - 136
  • [35] A Cross-project Defect Prediction Model Using Feature Transfer and Ensemble Learning
    Zeng, Fuping
    Lin, Wanting
    Xing, Ying
    Sun, Lu
    Yang, Bin
    [J]. TEHNICKI VJESNIK-TECHNICAL GAZETTE, 2022, 29 (04) : 1089 - 1099
  • [36] Multi-Objective Cross-Project Defect Prediction
    Canfora, Gerardo
    De Lucia, Andrea
    Di Penta, Massimiliano
    Oliveto, Rocco
    Panichella, Annibale
    Panichella, Sebastiano
    [J]. 2013 IEEE SIXTH INTERNATIONAL CONFERENCE ON SOFTWARE TESTING, VERIFICATION AND VALIDATION (ICST 2013), 2013, : 252 - 261
  • [37] Comprehensive Feature Extraction for Cross-Project Software Defect Prediction
    Reddy, Jagan Mohan
    Muthukumaran, K.
    Shahriar, Hossain
    Clincy, Victor
    Sakib, Nazmus
    [J]. 2022 IEEE 46TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2022), 2022, : 450 - 451
  • [38] Domain Adaptation Approach for Cross-project Software Defect Prediction
    Chen, Shu
    Ye, Jun-Min
    Liu, Tong
    [J]. Ruan Jian Xue Bao/Journal of Software, 2020, 31 (02) : 266 - 281
  • [39] Efficient Cross-Project Software Defect Prediction Based on Federated Meta-Learning
    Chen, Haisong
    Yang, Linlin
    Wang, Aili
    [J]. ELECTRONICS, 2024, 13 (06)
  • [40] A Top-k Learning to Rank Approach to Cross-Project Software Defect Prediction
    Wang, Feng
    Huang, Jinxiao
    Ma, Yutao
    [J]. 2018 25TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE (APSEC 2018), 2018, : 335 - 344