An Empirical Study of Deep Transfer Learning-Based Program Repair for Kotlin Projects

Cited by: 3
Authors
Kim, Misoo [1]
Kim, Youngkyoung [2]
Jeong, Hohyeon [2]
Heo, Jinseok [2]
Kim, Sungoh [3]
Chung, Hyunhee [3]
Lee, Eunseok [4]
Affiliations
[1] Sungkyunkwan Univ, Inst Software Convergence, Suwon, South Korea
[2] Sungkyunkwan Univ, Dept Elect & Comp Engn, Suwon, South Korea
[3] Samsung Elect, SW Engn Grp, Mobile Experience, Suwon, South Korea
[4] Sungkyunkwan Univ, Coll Comp & Informat, Suwon, South Korea
Funding
National Research Foundation of Singapore
Keywords
Empirical study; Deep learning-based program repair; Transfer learning; Industrial Kotlin project; SonarQube defects; SONARQUBE
DOI
10.1145/3540250.3558967
Chinese Library Classification number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
Deep learning-based automated program repair (DL-APR) can automatically fix software bugs and has attracted considerable attention in industry because of its potential to significantly reduce software development and maintenance costs. The Samsung Mobile Experience (MX) team is currently switching from Java to Kotlin projects. This study examines the application of DL-APR to automatically fix defects that arise during this switch; however, the shortage of Kotlin defect-fixing datasets in the Samsung MX team prevents us from fully exploiting the power of deep learning. Strategies are therefore needed to reuse a pretrained DL-APR model effectively. This need can be met by combining transfer learning with Kotlin defect-fixing datasets constructed from industrial and open-source repositories. This study aims to validate the performance of the pretrained DL-APR model in fixing defects in Samsung Kotlin projects and then to improve that performance by applying transfer learning. We show that transfer learning with open-source and industrial Kotlin defect-fixing datasets can improve the defect-fixing performance of the existing DL-APR model by 307%. Furthermore, transferring knowledge from an industrial (non-defect) bug-fixing dataset improved performance by 532% compared with the baseline DL-APR model. We also found that the embedded vectors and overlapping code tokens of code-change pairs are valuable features for selecting useful knowledge-transfer instances, improving the performance of APR models by up to 696%. Our study demonstrates the feasibility of transfer learning for practitioners who are considering applying DL-APR to industrial software.
Pages: 1441-1452
Number of pages: 12