Source Code Representations for Plagiarism Detection

Cited by: 1
Authors
Duracik, Michal [1]
Krsak, Emil [1]
Hrkut, Patrik [1]
Affiliations
[1] Univ Zilina, Fac Management Sci & Informat, Univ 8215-1, Zilina 01026, Slovakia
Keywords
Source code; Representations; Hash; Characteristic vector
DOI
10.1007/978-3-319-95522-3_6
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Plagiarism is a growing problem because so many resources are easily accessible, and many papers deal with this topic. New algorithms are constantly being created, but there are currently not many systems that we could use for plagiarism detection. Our aim is to explore plagiarism on a large scale. This paper focuses on selecting an appropriate representation of the source code, which is very important when searching for plagiarism. We give an overview of the current representation options and focus on representing source code with an abstract syntax tree (AST). Comparing tree structures is a time-consuming operation, so we try to find out how to represent an AST effectively in order to facilitate comparison. There are two ways to represent an AST: by hashing or by characteristic vectors. We present an experiment and the results on which we base our choice of the appropriate form of representation.
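The two AST representations named in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the hashes or characteristic vectors are built, so everything below is an assumption made for illustration: Python's standard `ast` module stands in for the parser, the hash covers node types only (identifiers and literal values are ignored), the characteristic vector is a plain count of node types, and the helper names `structural_hash`, `characteristic_vector`, and `cosine_similarity` are hypothetical.

```python
import ast
import hashlib
import math
from collections import Counter


def structural_hash(node: ast.AST) -> str:
    """Hash a subtree from its node type and its children's hashes.

    Assumption: identifiers and literal values are ignored, so renaming
    variables does not change the hash.
    """
    child_hashes = [structural_hash(child) for child in ast.iter_child_nodes(node)]
    payload = type(node).__name__ + "(" + ",".join(child_hashes) + ")"
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def characteristic_vector(node: ast.AST) -> Counter:
    """Count how often each node type occurs in the subtree (assumed vector form)."""
    return Counter(type(n).__name__ for n in ast.walk(node))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


original = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
# Same structure, every identifier renamed.
renamed = "def acc(items):\n    r = 0\n    for i in items:\n        r += i\n    return r\n"
# Renamed and with one extra statement inserted.
modified = "def acc(items):\n    r = 0\n    for i in items:\n        r += i\n    print(r)\n    return r\n"

tree_original = ast.parse(original)
tree_renamed = ast.parse(renamed)
tree_modified = ast.parse(modified)

# Hashing gives an exact match only when the structure is identical.
same_hash = structural_hash(tree_original) == structural_hash(tree_renamed)
changed_hash = structural_hash(tree_original) != structural_hash(tree_modified)

# Characteristic vectors give a graded similarity instead of a yes/no answer.
sim_renamed = cosine_similarity(
    characteristic_vector(tree_original), characteristic_vector(tree_renamed)
)
sim_modified = cosine_similarity(
    characteristic_vector(tree_original), characteristic_vector(tree_modified)
)

print(same_hash, changed_hash)
print(sim_renamed, sim_modified)
```

In this sketch the hash comparison is cheap but all-or-nothing, while the vector comparison tolerates small edits at the cost of a similarity computation. Which trade-off the authors prefer is reported in the paper itself, not in this record.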
Pages: 61-69
Page count: 9
Related papers
50 in total
  • [31] Source Code Plagiarism Detection Using Biological String Similarity Algorithms
    Rahal, Imad
    Wielga, Colin
    [J]. JOURNAL OF INFORMATION & KNOWLEDGE MANAGEMENT, 2014, 13 (03)
  • [32] Version History Based Source Code Plagiarism Detection in Proprietary Systems
    Maskeri, Girish
    Karnam, Deepthi
    Viswanathan, Sree Aurovindh
    Padmanabhuni, Srinivas
    [J]. 2012 28TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE (ICSM), 2012: 609 - 612
  • [33] Method and its system of Java source and byte code plagiarism detection
    Li, Hu
    Liu, Chao
    Liu, Nan
    Li, Xiaoli
    [J]. Beijing Hangkong Hangtian Daxue Xuebao/Journal of Beijing University of Aeronautics and Astronautics, 2010, 36 (04): 424 - 428
  • [34] Generating Pylogenetic Tree of Homogeneous Source Code in a Plagiarism Detection System
    Ji, Jeong-Hoon
    Park, Su-Hyun
    Woo, Gyun
    Cho, Hwan-Gue
    [J]. INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS, 2008, 6 (06) : 809 - 817
  • [35] Source Code Plagiarism Detection in Academia with Information Retrieval: Dataset and the Observation
    Karnalim, Oscar
    Budi, Setia
    Toba, Hapnes
    Joy, Mike
    [J]. INFORMATICS IN EDUCATION, 2019, 18 (02): 321 - 344
  • [36] Source Code Plagiarism Detection Based on Abstract Syntax Tree Fingerprintings
    Suttichaya, Vasin
    Eakvorachai, Niracha
    Lurkraisit, Tunchanok
    [J]. 2022 17TH INTERNATIONAL JOINT SYMPOSIUM ON ARTIFICIAL INTELLIGENCE AND NATURAL LANGUAGE PROCESSING (ISAI-NLP 2022) / 3RD INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND INTERNET OF THINGS (AIOT 2022), 2022
  • [37] Evolution analysis of homogenous source code and its application to plagiarism detection
    Ji, Jeong-Hoon
    Park, Su-Hyun
    Woo, Gyun
    Cho, Hwan-Gue
    [J]. PROCEEDINGS OF THE FRONTIERS IN THE CONVERGENCE OF BIOSCIENCE AND INFORMATION TECHNOLOGIES, 2007: 813 - 818
  • [38] Academic Source Code Plagiarism Detection by Measuring Program Behavioral Similarity
    Cheers, Hayden
    Lin, Yuqing
    Smith, Shamus P.
    [J]. IEEE ACCESS, 2021, 9 : 50391 - 50412
  • [39] Application of Source Code Plagiarism Detection and Grouping Techniques for Short Programs
    Ryman, Dylan
    Imbrie, P. K.
    Kastner, Jeff
    [J]. 2021 IEEE FRONTIERS IN EDUCATION CONFERENCE (FIE 2021), 2021
  • [40] Detecting source-code plagiarism
    Zeidman, B
    [J]. DR DOBBS JOURNAL, 2004, 29 (07): 57 - 60