Plagiarism Detection System for Indonesia Text Based Document by Fingerprint Method and Natural Language Processing Approach

Cited by: 0
Authors
Winarti, Titin [1]
Kerami, Djati [2]
Etp, Lussiana [3]
Sekarwati, Kemal Ade [4]
Affiliations
[1] Semarang Univ, Fac Informat Technol & Commun, Semarang 50196, Indonesia
[2] Indonesia Univ, Fac Math & Nat Sci, Depok 16424, Indonesia
[3] Sch Informat Management & Comp Jakarta, Comp Syst, Jakarta 12140, Indonesia
[4] Gunadarma Univ, Fac Comp Sci & Informat Technol, Jakarta 16424, Indonesia
Keywords
Plagiarism; Fingerprint; Natural Language Processing
DOI
10.1166/asl.2016.7993
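Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract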
Plagiarism is common in many communities, academia among them. It is a major concern in the academic environment in particular, where it can affect both the credibility of an institution and its ability to ensure the quality of its students. Plagiarism may also reduce creativity in the community. This research combines the fingerprint method with a natural language processing (NLP) approach. Plagiarism detection can be carried out with various methods, for example Manber's algorithm with similarity computed using the Jaccard coefficient, and the k-gram method as an alternative for detecting document similarity. The system is expected to let a user run the application and obtain an accurate similarity value without having to choose the gram size and its window. Although NLP techniques have been shown to improve the accuracy of detection tasks, other challenges remain. Current plagiarism detection tools are mostly limited to string-level comparisons of suspected plagiarised texts and potential original texts. Applying stemming in the document similarity measurement process produced an increase of 31% in the measurement for the documents that were tested.
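The abstract names k-grams, Manber-style fingerprint selection, and the Jaccard coefficient without giving details. The following is a minimal sketch of how those three pieces can fit together, not the authors' implementation. The function names, the MD5-based hash, the defaults `k=5` and `p=4`, and the two sample sentences are all assumptions made for illustration; stemming, which the abstract reports as beneficial, is not included and would be applied to the text before normalization.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase the text and drop everything except ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def kgrams(text: str, k: int) -> list[str]:
    """Return all overlapping character k-grams of the normalized text."""
    s = normalize(text)
    return [s[i:i + k] for i in range(len(s) - k + 1)]


def stable_hash(gram: str) -> int:
    """Deterministic 32-bit hash; Python's built-in hash() is salted per process."""
    digest = hashlib.md5(gram.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def fingerprint(text: str, k: int = 5, p: int = 4) -> set[int]:
    """Manber-style selection: keep only k-gram hashes that are 0 mod p."""
    return {h for h in map(stable_hash, kgrams(text, k)) if h % p == 0}


def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient |A ∩ B| / |A ∪ B|; defined here as 0.0 for two empty sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical sample sentences, used only to exercise the functions.
doc_a = "Plagiarisme adalah tindakan menjiplak karya orang lain."
doc_b = "Plagiarisme merupakan tindakan menjiplak karya milik orang lain."
score = jaccard(fingerprint(doc_a), fingerprint(doc_b))
print(f"similarity: {score:.2f}")
```

In this sketch a larger `p` keeps fewer hashes per document, trading sensitivity for a smaller fingerprint, and `p=1` keeps every k-gram hash. The gram size and selection parameters are the values the abstract says the user should not have to pick by hand.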
Pages: 3128-3131
Number of pages: 4