Source Code Representations for Plagiarism Detection

Authors
Duracik, Michal [1]
Krsak, Emil [1]
Hrkut, Patrik [1]
Affiliations
[1] Univ Zilina, Fac Management Sci & Informat, Univ 8215-1, Zilina 01026, Slovakia
Keywords
Source code; Representations; Hash; Characteristic vector
DOI
10.1007/978-3-319-95522-3_6
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Plagiarism is a growing problem because so many resources are easily accessible, and many papers deal with the topic. New algorithms are constantly being created, but there are currently not many systems that we could use for plagiarism detection. Our aim is to explore plagiarism on a large scale. This paper focuses on selecting an appropriate representation of the source code, which is very important when searching for plagiarism. We give an overview of the current representation possibilities and focus on representing source code using an abstract syntax tree (AST). Comparing tree structures is a time-consuming operation, so we try to find out how to represent an AST effectively in order to facilitate comparison. There are two ways to represent an AST: by hashing or by characteristic vectors. We present an experiment and the results on which we base our choice of the appropriate form of representation.
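The two AST representations named in the abstract can be illustrated with a minimal Python sketch. It is an illustration only, not the authors' implementation: it assumes Python's standard `ast` module as the parser, SHA-1 over node type names as the hash, node-type counts as the characteristic vector, and cosine similarity for comparing vectors. The function names `subtree_hash`, `characteristic_vector`, and `cosine_similarity` are hypothetical.

```python
import ast
import hashlib
import math
from collections import Counter


def subtree_hash(node: ast.AST) -> str:
    """Hash a subtree from its node type and the hashes of its children.

    Identifier names and literal values are ignored (an assumption of this
    sketch), so renaming variables does not change the hash.
    """
    child_hashes = "".join(subtree_hash(c) for c in ast.iter_child_nodes(node))
    payload = f"{type(node).__name__}({child_hashes})"
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def characteristic_vector(node: ast.AST) -> Counter:
    """Count how often each node type occurs in the subtree."""
    return Counter(type(n).__name__ for n in ast.walk(node))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample inputs: the second is the first with renamed identifiers.
ORIGINAL = "def add(a, b):\n    return a + b\n"
RENAMED = "def plus(x, y):\n    return x + y\n"
DIFFERENT = "def f(a):\n    for i in range(a):\n        print(i)\n"

if __name__ == "__main__":
    t1, t2, t3 = (ast.parse(src) for src in (ORIGINAL, RENAMED, DIFFERENT))
    print("hash match (renamed):", subtree_hash(t1) == subtree_hash(t2))
    print("hash match (different):", subtree_hash(t1) == subtree_hash(t3))
    v1, v3 = characteristic_vector(t1), characteristic_vector(t3)
    print("vector similarity (different):", cosine_similarity(v1, v3))
```

In this sketch a hash only tells us whether two subtrees are structurally identical, while the vector comparison yields a graded similarity score, which is the kind of trade-off a choice between the two representations has to weigh.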
Pages: 61-69
Number of pages: 9