Comparison of Image-Based and Text-Based Source Code Classification Using Deep Learning

Authors
Kiyak E.O. [1]
Cengiz A.B. [1]
Birant K.U. [2]
Birant D. [2]
Affiliations
[1] The Graduate School of Natural and Applied Sciences, Dokuz Eylul University, Izmir
[2] Department of Computer Engineering, Dokuz Eylul University, Izmir
Keywords
Deep learning; Image classification; Programming languages; Software engineering; Source code classification; Text mining
DOI
10.1007/s42979-020-00281-1
Abstract
Source code classification (SCC) is the task of assigning source code to categories according to a criterion such as functionality, programming language, or vulnerability. Many source code archives are organized by programming language, so that desired code fragments can be found easily by searching within the archive. However, having field experts organize these archives manually is labor intensive and impractical given how quickly the volume of available source code grows. Therefore, this study proposes new convolutional neural network (CNN) architectures for building source code classifiers that automatically identify the programming language of a piece of source code. This is the first study to compare the performance of deep learning algorithms for programming language identification on both image and text files. The experiments are performed on three source code datasets to identify eight programming languages: C, C++, C#, Go, Python, Ruby, Rust, and Java. The comparative results indicate that although the text-based and image-based SCC approaches achieve similar and very high accuracies (> 93.5%), text-based classification performs significantly better in terms of execution time. © 2020, Springer Nature Singapore Pte Ltd.