Fast Plagiarism Detection by Sentence Hashing

Authors
Ceglarek, Dariusz [1]
Haniewicz, Konstanty [2]
Affiliations
[1] Poznan Sch Banking, Poznan, Poland
[2] Poznan Univ Econ, Poznan, Poland
Keywords
plagiarism; plagiarism detection; longest common subsequence; semantic compression; SEIPro2S
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work presents SHAPD, a Sentence Hashing Algorithm for Plagiarism Detection. To give the user the best results, the algorithm exploits a particular trait of written text, its natural fragmentation into sentences, and then applies a set of dedicated text-representation techniques. The results obtained show that the algorithm delivers a solution faster than the alternatives. Its algorithmic complexity is logarithmic, so it performs better than most dynamic-programming algorithms used to find the longest common subsequence.
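The abstract does not describe SHAPD's internals, so the following is only a generic sketch of the sentence-hashing idea, not the authors' algorithm. It assumes a naive regex sentence splitter, a simple lowercase normalization, and SHA-1 as the hash; all function names (`split_sentences`, `normalize`, `sentence_hashes`, `shared_sentence_ratio`) are hypothetical and chosen for illustration.

```python
import hashlib
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def normalize(sentence: str) -> str:
    """Lowercase and keep only word tokens, so case and punctuation do not change the hash."""
    return " ".join(re.findall(r"\w+", sentence.lower()))


def sentence_hashes(text: str) -> set[str]:
    """Map every non-empty normalized sentence of the text to a SHA-1 hex digest."""
    hashes = set()
    for sentence in split_sentences(text):
        norm = normalize(sentence)
        if norm:
            hashes.add(hashlib.sha1(norm.encode("utf-8")).hexdigest())
    return hashes


def shared_sentence_ratio(suspect: str, source: str) -> float:
    """Fraction of distinct suspect sentences whose hash also occurs in the source."""
    suspect_hashes = sentence_hashes(suspect)
    if not suspect_hashes:
        return 0.0
    return len(suspect_hashes & sentence_hashes(source)) / len(suspect_hashes)


if __name__ == "__main__":
    source = "The cat sat on the mat. Dogs bark loudly! Birds fly."
    suspect = "Dogs bark loudly. The CAT sat on the mat. Fish swim."
    print(shared_sentence_ratio(suspect, source))
```

In this sketch each sentence is compared through a set lookup on its hash instead of a character-by-character alignment, which is the general reason hashing approaches avoid the quadratic cost of dynamic-programming longest-common-subsequence comparison. It detects only sentences that are identical after normalization; paraphrased sentences are not matched.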
Pages: 30-37
Page count: 8