Understanding the Automated Parameter Optimization on Transfer Learning for Cross-Project Defect Prediction: An Empirical Study

Cited by: 39

Authors:

Li, Ke [2]

Xiang, Zilin [1]

Chen, Tao [3]

Wang, Shuo [4]

Tan, Kay Chen [5]

Affiliations:

[1] UESTC, Coll Comp Sci & Engn, Chengdu 611731, Peoples R China

[2] Univ Exeter, Dept Comp Sci, Exeter EX4 4QF, Devon, England

[3] Loughborough Univ, Dept Comp Sci, Loughborough LE11 3TU, Leics, England

[4] Univ Birmingham, Sch Comp Sci, Birmingham B15 2TT, W Midlands, England

[5] City Univ Hong Kong, Dept Comp Sci, Tat Chee Ave, Hong Kong, Peoples R China

Source:

2020 ACM/IEEE 42ND INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2020) | 2020

Keywords:

Cross-project defect prediction; transfer learning; classification techniques; automated parameter optimization; OBJECT-ORIENTED SOFTWARE; MODELS; CLASSIFICATION; METRICS;

DOI:

10.1145/3377811.3380360

Chinese Library Classification (CLC) number:

TP31 [Computer Software]

Discipline classification codes:

081202; 0835

Abstract:

Data-driven defect prediction has become increasingly important in the software engineering process. Since it is not uncommon that data from a software project is insufficient for training a reliable defect prediction model, transfer learning that borrows data/knowledge from other projects to facilitate model building for the current project, namely cross-project defect prediction (CPDP), is naturally plausible. Most CPDP techniques involve two major steps, i.e., transfer learning and classification, each of which has at least one parameter to be tuned to achieve its optimal performance. This practice fits well with the purpose of automated parameter optimization. However, there is a lack of thorough understanding of the impacts of automated parameter optimization on various CPDP techniques. In this paper, we present the first empirical study that looks into such impacts on 62 CPDP techniques, 13 of which are chosen from the existing CPDP literature while the other 49 have not been explored before. We build defect prediction models over 20 real-world software projects that are of different scales and characteristics. Our findings demonstrate that: (1) Automated parameter optimization substantially improves the defect prediction performance of 77% of CPDP techniques with a manageable computational cost. Thus more efforts on this aspect are required in future CPDP studies. (2) Transfer learning is of ultimate importance in CPDP. Given a tight computational budget, it is more cost-effective to focus on optimizing the parameter configuration of transfer learning algorithms. (3) The research on CPDP is far from mature, in that it is 'not difficult' to find a better alternative by making a combination of existing transfer learning and classification techniques. This finding provides important insights about the future design of CPDP techniques.


Pages: 566-577

Page count: 12
