Feature Representation Method for Heterogeneous Defect Prediction Based on Variational Autoencoders

Authors
Jia X.-Y. [1]
Zhang W.-Z. [1]
Li W.-W. [2]
Huang Z.-Q. [3]
Affiliations
[1] School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing
[2] College of Aerospace Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing
[3] College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing
Source
Ruan Jian Xue Bao/Journal of Software | 2021 / Vol. 32 / No. 07
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
Feature representation; Heterogeneous defect prediction; Variational autoencoders
DOI
10.13328/j.cnki.jos.006257
Abstract
Cross-project defect prediction uses existing labeled defect data to predict defects in new, unlabeled data, but it requires the two projects to share the same metric features, which makes it difficult to apply in practice. Heterogeneous defect prediction removes the requirement that the source and target projects have the same set of metrics and has therefore attracted great interest. Existing heterogeneous defect prediction models use naive or traditional machine learning methods to learn feature representations shared by the source and target projects and perform prediction based on them. The feature representations learned in previous studies are weak, which leads to poor performance in predicting defect-prone instances. In view of the powerful feature extraction and representation capabilities of deep neural networks, this study proposes a feature representation method for heterogeneous defect prediction based on variational autoencoders. By combining the variational autoencoder with maximum mean discrepancy, the method learns a common feature representation of the source and target projects, on which a defect prediction model is then trained. The validity of the proposed method is verified by comparing it with traditional cross-project defect prediction methods and heterogeneous defect prediction methods on various datasets. © Copyright 2021, Institute of Software, the Chinese Academy of Sciences. All rights reserved.
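The abstract names maximum mean discrepancy (MMD) as the term that aligns the source and target representations but does not give its form. The following is a minimal sketch of one common way to estimate it, not the authors' implementation. It assumes a Gaussian (RBF) kernel, the biased sample estimator, and latent vectors that have already been produced by some encoder. The names `rbf_kernel`, `mean_kernel`, `mmd_squared`, and the `gamma` bandwidth are hypothetical and chosen for illustration only.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def rbf_kernel(x: Vector, y: Vector, gamma: float = 1.0) -> float:
    """Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2) (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs: Sequence[Vector], ys: Sequence[Vector], gamma: float) -> float:
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(source: Sequence[Vector], target: Sequence[Vector],
                gamma: float = 1.0) -> float:
    """Biased estimate of squared MMD between two samples of latent vectors."""
    return (mean_kernel(source, source, gamma)
            + mean_kernel(target, target, gamma)
            - 2.0 * mean_kernel(source, target, gamma))


# Illustrative latent codes (made-up values, not data from the paper).
source_latent = [[0.0, 0.0], [0.1, 0.0]]
target_latent = [[3.0, 3.0], [3.1, 3.0]]

print(mmd_squared(source_latent, source_latent))  # identical samples
print(mmd_squared(source_latent, target_latent))  # well-separated samples
```

In a setup like the one the abstract describes, a term of this kind would typically be added to the variational autoencoder loss so that minimizing it pulls the two latent distributions together. How the two terms are weighted is not stated in this record.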
Pages: 2204-2218
Page count: 14