Uncovering Source Code Reuse in Large-Scale Academic Environments

Cited by: 18
Authors
Flores, Enrique [1]
Barron-Cedeno, Alberto [3]
Moreno, Lidia [2]
Rosso, Paolo [2]
Affiliations
[1] Univ Politecn Valencia, Dept Informat Syst & Computat, E-46022 Valencia, Spain
[2] Univ Politecn Valencia, E-46022 Valencia, Spain
[3] Univ Politecn Cataluna, Talp Res Ctr, Barcelona, Spain
Keywords
source code reuse; plagiarism detection; authoring tools and methods; interactive learning environments; programming and programming languages; PLAGIARISM;
DOI
10.1002/cae.21608
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The advent of the Internet has caused an increase in content reuse, including source code. The purpose of this research is to uncover potential cases of source code reuse in large-scale environments. A good example is academia, where massive courses are taught to students who must demonstrate that they have acquired the knowledge. The need to detect content reuse in quasi real-time encourages the development of automatic systems such as the one described in this paper for source code reuse detection. Our approach is based on the comparison of programs at character level. It is able to find potential cases of reuse across a huge number of assignments. It achieved better results than JPlag, the most widely used online system for finding similarities among multiple sets of source code. The most common obfuscation operations we found were changes in identifier names, comments, and indentation. (c) 2014 Wiley Periodicals, Inc. Comput Appl Eng Educ 23:383-390, 2015.
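The abstract says only that programs are compared at character level and gives no further detail. As a rough illustration of what such a comparison can look like, here is a minimal sketch that scores two programs by the cosine similarity of their character n-gram counts. It is not the authors' implementation: the n-gram size of 3, the lowercasing and whitespace removal, and the function names `char_ngrams`, `cosine_similarity`, and `reuse_score` are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def char_ngrams(source: str, n: int = 3) -> Counter:
    """Count character n-grams after a simple normalization.

    Assumption: lowercasing and dropping all whitespace, so that
    indentation and spacing changes do not affect the profile.
    """
    text = "".join(source.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # one program is too short to yield any n-gram
    return dot / (norm_a * norm_b)


def reuse_score(program_a: str, program_b: str, n: int = 3) -> float:
    """Hypothetical helper: similarity of two programs in [0, 1]."""
    return cosine_similarity(char_ngrams(program_a, n), char_ngrams(program_b, n))


if __name__ == "__main__":
    original = "int sum(int a, int b) {\n    return a + b;\n}"
    reindented = "int sum(int a,int b){return a+b;}"
    unrelated = "print('hello world')"
    print(reuse_score(original, reindented))
    print(reuse_score(original, unrelated))
```

Under these assumptions, a pair that differs only in indentation gets the maximum score, while renamed identifiers or edited comments would lower it only in proportion to the characters changed. In a large course, pairs scoring above a chosen threshold would be flagged for manual review rather than treated as confirmed reuse.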
Pages: 383-390
Page count: 8
Related papers
50 in total
  • [1] Understanding Source Code Comments at Large-Scale
    He, Hao
    [J]. ESEC/FSE'2019: PROCEEDINGS OF THE 2019 27TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING, 2019, : 1217 - 1219
  • [2] A framework for reuse and parallelization of large-scale scientific simulation code
    Sherrill, ME
    Mancini, RC
    Harris, FC
    Dascalu, SM
    [J]. SERP '05: Proceedings of the 2005 International Conference on Software Engineering Research and Practice, Vols 1 and 2, 2005, : 52 - 58
  • [3] A Large-Scale Study on Source Code Reviewer Recommendation
    Lipcak, Jakub
    Rossi, Bruno
    [J]. 44TH EUROMICRO CONFERENCE ON SOFTWARE ENGINEERING AND ADVANCED APPLICATIONS (SEAA 2018), 2018, : 378 - 387
  • [4] ARCHITECTURES FOR LARGE-SCALE REUSE
    BECK, RP
    DESAI, SR
    RYAN, DR
    TOWER, RW
    VROOM, DQ
    WOOD, LM
    [J]. AT&T TECHNICAL JOURNAL, 1992, 71 (06): : 34 - 45
  • [5] Detection of Source Code Similitude in Academic Environments
    Bejarano, Andres M.
    Garcia, Lucy E.
    Zurek, Eduardo E.
    [J]. COMPUTER APPLICATIONS IN ENGINEERING EDUCATION, 2015, 23 (01) : 13 - 22
  • [6] DEVELOPING SOFTWARE FOR LARGE-SCALE REUSE
    SEIDEWITZ, E
    BALFOUR, B
    ADAMS, SS
    WADE, DM
    COX, B
    [J]. SIGPLAN NOTICES, 1993, 28 (10): : 137 - 143
  • [7] Images of large-scale environments
    Canter, D
    [J]. INTERNATIONAL JOURNAL OF PSYCHOLOGY, 1996, 31 (3-4) : 5055 - 5055
  • [8] A Unified Code Review Automation for Large-scale Industry with Diverse Development Environments
    Kim, Hyungjin
    Kwon, Yonghwi
    Kwon, Hyukin
    Ryou, Yeonhee
    Joh, Sangwoo
    Kim, Taeksu
    Kim, Chul-Joo
    [J]. 2022 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: SOFTWARE ENGINEERING IN PRACTICE (ICSE-SEIP 2022), 2022, : 23 - 24
  • [9] Environments for large-scale optimization
    Bouaricha, A
    More, JJ
    [J]. ZEITSCHRIFT FUR ANGEWANDTE MATHEMATIK UND MECHANIK, 1996, 76 : 37 - 39
  • [10] Localisation in large-scale environments
    Bailey, T
    Nebot, E
    [J]. ROBOTICS AND AUTONOMOUS SYSTEMS, 2001, 37 (04) : 261 - 281