Investigating the Generalizability of Deep Learning-based Clone Detectors

Cited by: 1
Authors
Choi, Eunjong [1]
Fuke, Norihiro [2]
Fujiwara, Yuji [2]
Yoshida, Norihiro [3]
Inoue, Katsuro [4]
Affiliations
[1] Kyoto Inst Technol, Kyoto, Japan
[2] Osaka Univ, Osaka, Japan
[3] Ritsumeikan Univ, Kyoto, Japan
[4] Nanzan Univ, Nagoya, Aichi, Japan
Keywords
code clone; deep learning; generalizability; CODE
DOI
10.1109/ICPC58990.2023.00032
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
The generalizability of Deep Learning (DL) models is a significant challenge, as poor generalizability indicates that a model has overfitted to its training data and cannot generalize to new data. Although numerous DL-based clone detectors have emerged in recent years, their generalizability has not been thoroughly assessed. This study investigates the generalizability of three DL-based clone detectors (CCLearner, ASTNN, and CodeBERT) by comparing their detection accuracy when they are trained and tested on different clone benchmarks. The results show that none of the three clone detectors generalizes well to new data, and that there is a strong relationship between clone types and generalizability for CCLearner and ASTNN.
Pages: 181-185
Number of pages: 5