Source-code Similarity Detection and Detection Tools Used in Academia: A Systematic Review

Cited by: 61
Authors
Novak, Matija [1]
Joy, Mike [2]
Kermek, Dragutin [1]
Affiliations
[1] Univ Zagreb, Fac Org & Informat, Pavlinska 2, Varazhdin 42000, Croatia
[2] Univ Warwick, Dept Comp Sci, Coventry CV4 7AL, W Midlands, England
Keywords
Source-code; plagiarism; similarity; detection; academia; education; programming; systematic review; PLAGIARISM DETECTION; PROGRAMS; TREE; SET;
DOI
10.1145/3313290
Chinese Library Classification number
G40 [Education]
Discipline classification codes
040101; 120403
Abstract
Teachers deal with plagiarism on a regular basis, so they try to prevent and detect plagiarism, a task that is complicated by the large size of some classes. Students who cheat often try to hide their plagiarism (obfuscate), and many different similarity detection engines (often called plagiarism detection tools) have been built to help teachers. This article focuses only on plagiarism detection and presents a detailed systematic review of the field of source-code plagiarism detection in academia. This review gives an overview of definitions of plagiarism, plagiarism detection tools, comparison metrics, obfuscation methods, datasets used for comparison, and algorithm types. Perspectives on the meaning of source-code plagiarism detection in academia are presented, together with categorisations of the available detection tools and analyses of their effectiveness. The review also yielded insights into metrics and datasets for quantitative tool comparison and into the categorisation of detection algorithms. In addition, existing classifications of obfuscation methods have been expanded, and a new definition of "source-code plagiarism detection in academia" is given.
Pages: 37
Related papers
50 in total
  • [1] Review of source-code plagiarism detection in academia
    Novak, Matija
    [J]. 2016 39TH INTERNATIONAL CONVENTION ON INFORMATION AND COMMUNICATION TECHNOLOGY, ELECTRONICS AND MICROELECTRONICS (MIPRO), 2016: 796-801
  • [2] Calibration of source-code similarity detection tools for objective comparisons
    Novak, M.
    Kermek, D.
    Joy, M.
    [J]. 2018 41ST INTERNATIONAL CONVENTION ON INFORMATION AND COMMUNICATION TECHNOLOGY, ELECTRONICS AND MICROELECTRONICS (MIPRO), 2018: 794-799
  • [3] Evaluating the Performance of LSA for Source-code Plagiarism Detection
    Cosma, Georgina
    Joy, Mike
    [J]. INFORMATICA-JOURNAL OF COMPUTING AND INFORMATICS, 2012, 36(04): 409-424
  • [4] A systematic literature review on source code similarity measurement and clone detection: Techniques, applications, and challenges
    Zakeri-Nasrabadi, Morteza
    Parsa, Saeed
    Ramezani, Mohammad
    Roy, Chanchal
    Ekhtiarzadeh, Masoud
    [J]. JOURNAL OF SYSTEMS AND SOFTWARE, 2023, 204
  • [5] A Source Code Similarity System for Plagiarism Detection
    Duric, Zoran
    Gasevic, Dragan
    [J]. COMPUTER JOURNAL, 2013, 56(01): 70-86
  • [6] Android Source Code Vulnerability Detection: A Systematic Literature Review
    Senanayake, Janaka
    Kalutarage, Harsha
    Al-Kadri, Mhd Omar
    Petrovski, Andrei
    Piras, Luca
    [J]. ACM COMPUTING SURVEYS, 2023, 55(09)
  • [7] On the relationship between source-code metrics and cognitive load: A systematic tertiary review
    Abbad-Andaloussi, Amine
    [J]. JOURNAL OF SYSTEMS AND SOFTWARE, 2023, 198
  • [8] Scalable Source Code Similarity Detection in Large Code Repositories
    Alomari, Firas
    Harbi, Muhammed
    [J]. EAI ENDORSED TRANSACTIONS ON SCALABLE INFORMATION SYSTEMS, 2019, 6(22): 1-11
  • [9] An Assessment of Vulnerable Detection Source Code Tools
    Verma, Anoop Kumar
    Sharma, Aman Kumar
    [J]. SOFTWARE ENGINEERING (CSI 2015), 2019, 731: 403-412
  • [10] An Approach to Source-Code Plagiarism Detection and Investigation Using Latent Semantic Analysis
    Cosma, Georgina
    Joy, Mike
    [J]. IEEE TRANSACTIONS ON COMPUTERS, 2012, 61(03): 379-394