Feature Representation Method for Heterogeneous Defect Prediction Based on Variational Autoencoders

Cited by: 0
Authors
Jia X.-Y. [1]
Zhang W.-Z. [1]
Li W.-W. [2]
Huang Z.-Q. [3]
Affiliations
[1] School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing
[2] College of Aerospace Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing
[3] College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing
Source
Ruan Jian Xue Bao/Journal of Software | 2021, Vol. 32, No. 07
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
Feature representation; Heterogeneous defect prediction; Variational autoencoders
DOI
10.13328/j.cnki.jos.006257
Abstract
Cross-project defect prediction can use existing labeled defect data to predict defects in new, unlabeled data, but it requires the two projects to share the same metric features, which makes it difficult to apply in actual development. Heterogeneous defect prediction does not require the source and target projects to have the same set of metrics and has therefore attracted great interest. Existing heterogeneous defect prediction models use naive or traditional machine learning methods to learn feature representations between the source and target projects and make predictions based on them. The feature representations learned in previous studies are weak, which leads to poor performance in predicting defect-prone instances. In view of the powerful feature extraction and representation capabilities of deep neural networks, this study proposes a feature representation method for heterogeneous defect prediction based on variational autoencoders. By combining a variational autoencoder with maximum mean discrepancy, the method can effectively learn a common feature representation of the source and target projects, on which an effective defect prediction model can then be trained. The validity of the proposed method is verified by comparing it with traditional cross-project defect prediction methods and heterogeneous defect prediction methods on various datasets. © Copyright 2021, Institute of Software, the Chinese Academy of Sciences. All rights reserved.
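The abstract names maximum mean discrepancy (MMD) as the term that pulls the source and target representations together. The abstract does not give the paper's kernel, loss weighting, or network architecture, so the following is only a minimal sketch of a generic biased MMD estimate with a Gaussian kernel, not the authors' implementation. The names `rbf_kernel`, `mean_kernel`, `mmd_squared`, the `gamma` bandwidth, and the toy latent codes are all assumptions made for illustration.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def rbf_kernel(x: Vector, y: Vector, gamma: float = 1.0) -> float:
    """Gaussian (RBF) kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs: Sequence[Vector], ys: Sequence[Vector], gamma: float) -> float:
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(source: Sequence[Vector], target: Sequence[Vector],
                gamma: float = 1.0) -> float:
    """Biased estimate of the squared MMD between two samples."""
    return (mean_kernel(source, source, gamma)
            + mean_kernel(target, target, gamma)
            - 2.0 * mean_kernel(source, target, gamma))


# Hypothetical latent codes, standing in for encoder outputs of two projects.
source_latent = [[0.0, 0.1], [0.2, -0.1], [0.1, 0.0]]
target_near = [[0.1, 0.1], [0.0, -0.1], [0.2, 0.0]]
target_far = [[3.0, 3.1], [3.2, 2.9], [3.1, 3.0]]

print("near:", mmd_squared(source_latent, target_near))
print("far: ", mmd_squared(source_latent, target_far))
```

In a setup of this kind, a term like `mmd_squared` would typically be added to the autoencoder's training loss so that minimizing it aligns the two latent distributions. The estimate is small when the two samples overlap and grows as they separate.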
Pages: 2204-2218
Page count: 14
Related papers
29 in total
  • [1] Hall T, Beecham S, Bowes D, Gray D, Counsell S., A systematic literature review on fault prediction performance in software engineering, IEEE Trans. on Software Engineering, 38, 6, pp. 1276-1304, (2011)
  • [2] Chen X, Gu Q, Liu WS, Liu SL, Ni C., Survey of static software defect prediction, Ruan Jian Xue Bao/Journal of Software, 27, 1, pp. 1-25, (2016)
  • [3] D'Ambros M, Lanza M, Robbes R., Evaluating defect prediction approaches: A benchmark and an extensive comparison, Empirical Software Engineering, 17, 4-5, pp. 531-577, (2012)
  • [4] Lee T, Nam J, Han D, Kim S., Developer micro interaction metrics for software defect prediction, IEEE Trans. on Software Engineering, 42, 11, pp. 1015-1035, (2016)
  • [5] Menzies T, Greenwald J, Frank A., Data mining static code attributes to learn defect predictors, IEEE Trans. on Software Engineering, 33, 1, pp. 2-13, (2006)
  • [6] Zimmermann T, Nagappan N, Gall H, Giger E, Murphy B., Cross-project defect prediction: A large scale experiment on data vs. domain vs. process, Proc. of the 7th Joint Meeting of the European Software Engineering Conf. and the ACM SIGSOFT Symp. on the Foundations of Software Engineering, pp. 91-100, (2009)
  • [7] He Z, Shu F, Yang Y, Li MS, Wang Q., An investigation on the feasibility of cross-project defect prediction, Automated Software Engineering, 19, 2, pp. 167-199, (2012)
  • [8] Nam J, Pan SJ, Kim S., Transfer defect learning, Proc. of the 35th Int'l Conf. on Software Engineering, pp. 382-391, (2013)
  • [9] Ma Y, Luo GC, Zeng X, Chen AG., Transfer learning for cross-company software defect prediction, Information and Software Technology, 54, 3, pp. 248-256, (2012)
  • [10] Nam J, Fu W, Kim S, Menzies T, Tan L., Heterogeneous defect prediction, IEEE Trans. on Software Engineering, 44, 9, pp. 874-896, (2017)