Cross-Language Plagiarism Detection Model Based On Multiple Features

Authors
Liu, Gang [1, 2]
Dong, Yichao [1]
Li, Guangxi [1]
Affiliations
[1] Harbin Engineering University, College of Computer Science and Technology, Harbin, People's Republic of China
[2] Nanjing University, State Key Laboratory for Novel Software Technology, Nanjing, People's Republic of China
Keywords
Feature Selection; Candidate Retrieval; Translation Features; Cross-Language; Dictionary
DOI
10.1109/ISCC53001.2021.9631406
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
As information sharing becomes ever more convenient, plagiarism has become increasingly common. Cross-language plagiarism detection is an important problem that the academic community is working collectively to solve. This paper proposes a multiple-feature-based cross-language plagiarism detection model with two stages: cross-language plagiarism candidate retrieval based on multiple features, and cross-language plagiarism detection based on dynamic text alignment. Candidate retrieval relies mainly on translation features. For detection, a text-alignment-based similarity analysis is used to filter the final results between the identified paragraphs. In this step, the approach does not use a machine translation system to convert longer text; instead, it uses a dictionary to obtain the translation of individual words. Experimental results show that the method outperforms previous methods and achieves the best results on four datasets.
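The dictionary-based idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy Spanish-English dictionary `TOY_DICT`, the helper names, and the use of Jaccard overlap as the ranking score are hypothetical stand-ins, not the translation features or the dynamic text alignment that the paper itself uses.

```python
# Hypothetical sketch only: the toy dictionary and the Jaccard score are
# illustrative assumptions, not the features or alignment used in the paper.
from typing import Dict, List, Set, Tuple

# Assumed toy bilingual dictionary: source word -> possible translations.
TOY_DICT: Dict[str, List[str]] = {
    "gato": ["cat"],
    "negro": ["black"],
    "duerme": ["sleeps", "sleep"],
    "perro": ["dog"],
}


def translate_words(tokens: List[str], dictionary: Dict[str, List[str]]) -> Set[str]:
    """Look up each word on its own; words missing from the dictionary are skipped."""
    translated: Set[str] = set()
    for tok in tokens:
        translated.update(dictionary.get(tok.lower(), []))
    return translated


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Set overlap in [0, 1]; defined as 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_candidates(
    suspicious: str, sources: List[str], dictionary: Dict[str, List[str]]
) -> List[Tuple[float, str]]:
    """Score every source passage against the word-translated suspicious text."""
    query = translate_words(suspicious.split(), dictionary)
    scored = [(jaccard(query, set(s.lower().split())), s) for s in sources]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


ranked = rank_candidates(
    "El gato negro duerme",
    ["the black cat sleeps", "a dog barks loudly"],
    TOY_DICT,
)
```

Word-level lookup avoids running a full machine translation system over long passages, at the cost of ignoring word order and context; a real system would need a much larger dictionary and a stronger similarity measure.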
Pages: 7