MHCPDP: multi-source heterogeneous cross-project defect prediction via multi-source transfer learning and autoencoder

Cited by: 19
Authors
Wu, Jie [1]
Wu, Yingbo [1]
Niu, Nan [2]
Zhou, Min [1]
Affiliations
[1] Chongqing Univ, Sch Software Engn, Chongqing, Peoples R China
[2] Univ Cincinnati, Dept Elect Engn & Comp Sci, Cincinnati, OH USA
Keywords
Autoencoder; Heterogeneous cross-project defect prediction; Multi-source transfer learning; Modified autoencoder; Canonical correlation analysis
DOI
10.1007/s11219-021-09553-2
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Heterogeneous cross-project defect prediction (HCPDP) aims to build a defect prediction model for a target project by reusing datasets from source projects whose features differ from those of the target project. Most existing HCPDP methods only remove redundant or unrelated features without exploring the underlying features of cross-project datasets. In addition, when transfer learning is used in HCPDP, these methods ignore its negative effects (negative transfer). In this paper, we propose a novel HCPDP method called multi-source heterogeneous cross-project defect prediction (MHCPDP). To reduce the gap between the target and source datasets, MHCPDP uses an autoencoder to extract intermediate features from the original datasets instead of simply removing redundant and unrelated features, and it adopts a modified autoencoder algorithm to perform instance selection, eliminating irrelevant instances from the source domain datasets. Furthermore, by incorporating multiple source projects to increase the number of source datasets, MHCPDP develops a multi-source transfer learning algorithm to reduce the impact of negative transfer and improve the performance of the classifier. We comprehensively evaluate MHCPDP on five open source datasets; the experimental results show that MHCPDP not only achieves significant improvement in two performance metrics but also overcomes the shortcomings of conventional HCPDP methods.
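The feature-extraction idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the network size, the training loop, the random stand-in data, and every name below (`train_autoencoder`, `code_dim`, and so on) are assumptions made for illustration. It trains one small autoencoder per dataset so that a source dataset with 20 metrics and a target dataset with 8 metrics are both mapped to 5 intermediate features. Separate autoencoders only give both datasets the same dimensionality; they do not align the two feature spaces, and the instance selection and multi-source transfer steps are not shown.

```python
import numpy as np


def train_autoencoder(X, code_dim, epochs=300, lr=0.01, seed=0):
    """Train a one-hidden-layer autoencoder with plain gradient descent.

    Hypothetical helper for illustration only.
    Returns (encode, losses): a function mapping rows to code_dim
    features, and the reconstruction loss recorded at every epoch.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Standardize columns so metrics on different scales weigh equally.
    mu = X.mean(axis=0)
    sigma = X.std(axis=0) + 1e-8
    Z = (X - mu) / sigma

    W1 = rng.normal(0.0, 0.1, (d, code_dim))
    b1 = np.zeros(code_dim)
    W2 = rng.normal(0.0, 0.1, (code_dim, d))
    b2 = np.zeros(d)

    losses = []
    for _ in range(epochs):
        H = np.tanh(Z @ W1 + b1)           # encoder: intermediate features
        R = H @ W2 + b2                    # decoder: linear reconstruction
        losses.append(float(np.sum((R - Z) ** 2) / n))

        G = 2.0 * (R - Z) / n              # d(loss)/dR
        gW2 = H.T @ G
        gb2 = G.sum(axis=0)
        GH = (G @ W2.T) * (1.0 - H ** 2)   # backpropagate through tanh
        gW1 = Z.T @ GH
        gb1 = GH.sum(axis=0)

        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2

    def encode(A):
        return np.tanh(((A - mu) / sigma) @ W1 + b1)

    return encode, losses


# Random stand-in data (not real defect datasets): the source and target
# projects are described by different numbers of metrics.
rng = np.random.default_rng(42)
X_source = rng.normal(size=(200, 20))
X_target = rng.normal(size=(80, 8))

encode_src, src_losses = train_autoencoder(X_source, code_dim=5)
encode_tgt, tgt_losses = train_autoencoder(X_target, code_dim=5)

src_code = encode_src(X_source)   # shape (200, 5)
tgt_code = encode_tgt(X_target)   # shape (80, 5)
```

After encoding, both datasets have the same number of columns, which is the precondition for training a classifier on source instances and applying it to target instances.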
Pages: 405-430
Page count: 26
Related papers
50 in total
  • [1] MHCPDP: multi-source heterogeneous cross-project defect prediction via multi-source transfer learning and autoencoder
    Jie Wu
    Yingbo Wu 
    Nan Niu
    Min Zhou
    [J]. Software Quality Journal, 2021, 29 : 405 - 430
  • [2] MSCPDPLab: A MATLAB toolbox for transfer learning based multi-source cross-project defect prediction
    Zou, Jiaqi
    Li, Zonghao
    Liu, Xuanying
    Tong, Haonan
    [J]. SOFTWAREX, 2023, 21
  • [4] An Empirical Study on Multi-Source Cross-Project Defect Prediction Models
    Liu, Xuanying
    Li, Zonghao
    Zou, Jiaqi
    Tong, Haonan
    [J]. 2022 29TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE, APSEC, 2022, : 318 - 327
  • [5] Dissimilarity Space Based Multi-Source Cross-Project Defect Prediction
    Ren, Shengbing
    Zhang, Wanying
    Munir, Hafiz Shahbaz
    Xia, Lei
    [J]. ALGORITHMS, 2019, 12 (01)
  • [6] A three-stage transfer learning framework for multi-source cross-project software defect prediction
    Bai, Jiaojiao
    Jia, Jingdong
    Capretz, Luiz Fernando
    [J]. INFORMATION AND SOFTWARE TECHNOLOGY, 2022, 150
  • [8] MASTER: Multi-Source Transfer Weighted Ensemble Learning for Multiple Sources Cross-Project Defect Prediction
    Tong, Haonan
    Zhang, Dalin
    Liu, Jiqiang
    Xing, Weiwei
    Lu, Lingyun
    Lu, Wei
    Wu, Yumei
    [J]. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2024, 50 (05) : 1281 - 1305
  • [10] Cross-project software defect prediction based on multi-source data sets
    Huang Junfu
    Wang Yawen
    Gong Yunzhan
    Jin Dahai
    [J]. The Journal of China Universities of Posts and Telecommunications, 2021, 28 (04) : 75 - 87