Understanding the Automated Parameter Optimization on Transfer Learning for Cross-Project Defect Prediction: An Empirical Study

Cited by: 39
Authors
Li, Ke [2]
Xiang, Zilin [1]
Chen, Tao [3]
Wang, Shuo [4]
Tan, Kay Chen [5]
Affiliations
[1] UESTC, Coll Comp Sci & Engn, Chengdu 611731, Peoples R China
[2] Univ Exeter, Dept Comp Sci, Exeter EX4 4QF, Devon, England
[3] Loughborough Univ, Dept Comp Sci, Loughborough LE11 3TU, Leics, England
[4] Univ Birmingham, Sch Comp Sci, Birmingham B15 2TT, W Midlands, England
[5] City Univ Hong Kong, Dept Comp Sci, Tat Chee Ave, Hong Kong, Peoples R China
Keywords
Cross-project defect prediction; transfer learning; classification techniques; automated parameter optimization; OBJECT-ORIENTED SOFTWARE; MODELS; CLASSIFICATION; METRICS;
DOI
10.1145/3377811.3380360
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Data-driven defect prediction has become increasingly important in the software engineering process. Since it is not uncommon for the data from a software project to be insufficient for training a reliable defect prediction model, transfer learning that borrows data or knowledge from other projects to facilitate model building for the current project, namely cross-project defect prediction (CPDP), is naturally plausible. Most CPDP techniques involve two major steps, i.e., transfer learning and classification, each of which has at least one parameter to be tuned to achieve its optimal performance. This practice fits well with the purpose of automated parameter optimization. However, there is a lack of thorough understanding of the impacts of automated parameter optimization on various CPDP techniques. In this paper, we present the first empirical study that looks into such impacts on 62 CPDP techniques, 13 of which are chosen from the existing CPDP literature while the other 49 have not been explored before. We build defect prediction models over 20 real-world software projects of different scales and characteristics. Our findings demonstrate that: (1) Automated parameter optimization substantially improves the defect prediction performance of 77% of the CPDP techniques at a manageable computational cost. Thus more effort on this aspect is required in future CPDP studies. (2) Transfer learning is of ultimate importance in CPDP. Given a tight computational budget, it is more cost-effective to focus on optimizing the parameter configuration of the transfer learning algorithms. (3) The research on CPDP is far from mature, in that it is 'not difficult' to find a better alternative by combining existing transfer learning and classification techniques. This finding provides important insights into the future design of CPDP techniques.
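The two-step structure described in the abstract (a transfer learning step followed by a classifier, each with tunable parameters) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the synthetic data, the nearest-neighbour instance filter used as the transfer step, the k-NN classifier, the parameter names `filter_k` and `n_neighbors`, and the use of plain random search as the optimizer. Accuracy on labelled target data is used as the score only to keep the example short.

```python
import random


def make_project(rng, n, shift):
    """Hypothetical synthetic project: 2-D metrics, label 1 means defective."""
    rows = []
    for _ in range(n):
        x = (rng.random() + shift, rng.random() + shift)
        rows.append((x, int(x[0] + x[1] > 1.0 + 2 * shift)))
    return rows


def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nn_filter(source, target, k):
    """Transfer step (assumed): keep the k source rows nearest to each target row."""
    keep = set()
    for t_x, _ in target:
        ranked = sorted(range(len(source)), key=lambda i: sq_dist(source[i][0], t_x))
        keep.update(ranked[:k])
    return [source[i] for i in sorted(keep)]


def knn_predict(train, x, n_neighbors):
    """Classification step (assumed): majority vote among the nearest training rows."""
    ranked = sorted(train, key=lambda row: sq_dist(row[0], x))
    votes = [label for _, label in ranked[:n_neighbors]]
    return int(sum(votes) * 2 > len(votes))


def evaluate(config, source, target):
    """Accuracy on the target project for one joint parameter configuration."""
    train = nn_filter(source, target, config["filter_k"])
    hits = sum(knn_predict(train, x, config["n_neighbors"]) == y for x, y in target)
    return hits / len(target)


def random_search(source, target, budget, seed=0):
    """Sample `budget` joint configurations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, -1.0
    for _ in range(budget):
        config = {
            "filter_k": rng.randint(1, 10),        # transfer-step parameter
            "n_neighbors": rng.choice([1, 3, 5]),  # classifier parameter
        }
        score = evaluate(config, source, target)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


if __name__ == "__main__":
    data_rng = random.Random(1)
    source = make_project(data_rng, 40, 0.0)  # data-rich source project
    target = make_project(data_rng, 20, 0.5)  # shifted target project
    print(random_search(source, target, budget=10))
```

Restricting the sampled space to `filter_k` alone, with `n_neighbors` held fixed, would mimic spending a tight budget on the transfer learning parameters only, as the abstract's second finding suggests.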
Pages: 566-577
Page count: 12