MASTER: Multi-Source Transfer Weighted Ensemble Learning for Multiple Sources Cross-Project Defect Prediction

Cited by: 0

Authors:

Tong, Haonan [1]
Zhang, Dalin [1]
Liu, Jiqiang [1]
Xing, Weiwei [1]
Lu, Lingyun [1]
Lu, Wei [1]
Wu, Yumei [2]

Affiliations:

[1] Beijing Jiaotong Univ, Sch Software Engn, Beijing 100044, Peoples R China

[2] Beihang Univ, Sch Reliabil & Syst Engn, Beijing 100191, Peoples R China

Source:

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING | 2024, Vol. 50, No. 05

Keywords:

Multiple source datasets; cross-project defect prediction; software defect proneness; feature weighting; transfer learning; MODEL;

DOI:

10.1109/TSE.2024.3381235

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Subject classification codes:

081202 ; 0835 ;

Abstract:

Multi-source cross-project defect prediction (MSCPDP) attempts to transfer defect knowledge learned from multiple source projects to the target project. MSCPDP has drawn increasing attention from the academic and industry communities owing to its advantages over single-source cross-project defect prediction (SSCPDP). However, two main problems seriously restrict the performance of existing MSCPDP models: how to effectively extract the transferable knowledge from each source dataset, and how to measure the amount of knowledge transferred from each source dataset to the target dataset. In this paper, we propose a novel multi-source transfer weighted ensemble learning (MASTER) method for MSCPDP. MASTER measures the weight of each source dataset based on feature importance and distribution difference, and then extracts the transferable knowledge using the proposed feature-weighted transfer learning algorithm. Experiments are performed on 30 software projects. We compare MASTER with the latest state-of-the-art MSCPDP methods, using statistical tests, in terms of four well-known effort-unaware measures (PD, PF, AUC, and MCC) and two widely used effort-aware measures (P-opt 20% and IFA). The experimental results show that: 1) MASTER substantially improves prediction performance compared with the baselines, e.g., by at least 49.1% in MCC and 48.1% in IFA; 2) MASTER significantly outperforms each baseline on most datasets in terms of AUC, MCC, P-opt 20%, and IFA; 3) the MSCPDP model performs significantly better than the mean case of the SSCPDP model on most datasets and even outperforms the best case of SSCPDP on some datasets. We conclude that 1) it is necessary to conduct MSCPDP, and 2) the proposed MASTER is a more promising alternative for MSCPDP.
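
Note on the effort-unaware measures: the abstract names PD, PF, and MCC without defining them. The sketch below uses the standard confusion-matrix definitions (PD as recall on defective modules, PF as the false-alarm rate, MCC as the Matthews correlation coefficient). It is an illustration under those assumptions, not the paper's evaluation code; the function name and the example counts are hypothetical.

```python
import math


def effort_unaware_measures(tp, fp, tn, fn):
    """Return (PD, PF, MCC) from confusion-matrix counts.

    Assumes the "defective" class is the positive class.
    Ratios with a zero denominator are reported as 0.0.
    """
    # PD: share of truly defective modules that were flagged
    pd_rate = tp / (tp + fn) if (tp + fn) else 0.0
    # PF: share of clean modules that were wrongly flagged
    pf_rate = fp / (fp + tn) if (fp + tn) else 0.0
    # MCC: correlation between predicted and actual labels, in [-1, 1]
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return pd_rate, pf_rate, mcc


# Hypothetical counts, for illustration only
pd_rate, pf_rate, mcc = effort_unaware_measures(tp=30, fp=10, tn=50, fn=10)
print(f"PD={pd_rate:.3f} PF={pf_rate:.3f} MCC={mcc:.3f}")
```

A higher PD and MCC and a lower PF indicate better prediction; AUC and the effort-aware measures (P-opt 20%, IFA) need ranked predictions and effort data, so they are not covered here.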


Pages: 1281 - 1305

Page count: 25

50 items in total

[1] MHCPDP: multi-source heterogeneous cross-project defect prediction via multi-source transfer learning and autoencoder
Wu, Jie
Wu, Yingbo
Niu, Nan
Zhou, Min
[J]. SOFTWARE QUALITY JOURNAL, 2021, 29 (02) : 405 - 430
[2] MSCPDPLab: A MATLAB toolbox for transfer learning based multi-source cross-project defect prediction
Zou, Jiaqi
Li, Zonghao
Liu, Xuanying
Tong, Haonan
[J]. SOFTWAREX, 2023, 21
[3] A three-stage transfer learning framework for multi-source cross-project software defect prediction
Bai, Jiaojiao
Jia, Jingdong
Capretz, Luiz Fernando
[J]. INFORMATION AND SOFTWARE TECHNOLOGY, 2022, 150
[4] An Empirical Study on Multi-Source Cross-Project Defect Prediction Models
Liu, Xuanying
Li, Zonghao
Zou, Jiaqi
Tong, Haonan
[J]. 2022 29TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE, APSEC, 2022 : 318 - 327
[5] Dissimilarity Space Based Multi-Source Cross-Project Defect Prediction
Ren, Shengbing
Zhang, Wanying
Munir, Hafiz Shahbaz
Xia, Lei
[J]. ALGORITHMS, 2019, 12 (01)
[6] Heterogeneous cross-project defect prediction with multiple source projects based on transfer learning
Yin, Xinglong
Liu, Lei
Liu, Huaxiao
Wu, Qi
[J]. MATHEMATICAL BIOSCIENCES AND ENGINEERING, 2020, 17 (02) : 1020 - 1040
