A systematic study of knowledge graph analysis for cross-language plagiarism detection

Cited by: 64
Authors
Franco-Salvador, Marc [1]
Rosso, Paolo [1]
Montes-y-Gomez, Manuel [2]
Affiliations
[1] Univ Politecn Valencia, Pattern Recognit & Human Language Technol PRHLT R, Camino Vera S-N, E-46022 Valencia, Spain
[2] Inst Nacl Astrofis Opt & Electr, Dept Comp Sci, Luis Enrique Erro 1, Puebla 72840, Mexico
Keywords
Cross-language; Plagiarism detection; Knowledge graphs; Multilingual semantic network; Distributed representations; Evaluation;
DOI
10.1016/j.ipm.2015.12.004
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Cross-language plagiarism detection aims to detect plagiarised fragments of text among documents in different languages. In this paper, we perform a systematic examination of Cross-language Knowledge Graph Analysis, an approach that represents text fragments using knowledge graphs as a language-independent content model. We analyse the contributions to cross-language plagiarism detection of the different aspects covered by knowledge graphs: word sense disambiguation, vocabulary expansion, and representation by similarities with a collection of concepts. In addition, we study the relevance of both concepts and their relations when detecting plagiarism. Finally, as a key component of the knowledge graph construction, we present a new weighting scheme for relations between concepts based on distributed representations of concepts. Experimental results in Spanish-English and German-English plagiarism detection show state-of-the-art performance and provide interesting insights into the use of knowledge graphs. (C) 2015 Elsevier Ltd. All rights reserved.
Pages: 550 - 570
Page count: 21
Related papers
50 in total
  • [1] Cross-language plagiarism detection
    Potthast, Martin
    Barron-Cedeno, Alberto
    Stein, Benno
    Rosso, Paolo
    [J]. LANGUAGE RESOURCES AND EVALUATION, 2011, 45 (01) : 45 - 62
  • [2] Cross-language plagiarism detection
    Martin Potthast
    Alberto Barrón-Cedeño
    Benno Stein
    Paolo Rosso
    [J]. Language Resources and Evaluation, 2011, 45 : 45 - 62
  • [3] Graph-Based Similarity Analysis: A New Approach to Cross-Language Plagiarism Detection
    Franco-Salvador, Marc
    Gupta, Parth
    Rosso, Paolo
    [J]. PROCESAMIENTO DEL LENGUAJE NATURAL, 2013, (50) : 21 - 28
  • [4] Methods for cross-language plagiarism detection
    Barron-Cedeno, Alberto
    Gupta, Parth
    Rosso, Paolo
    [J]. KNOWLEDGE-BASED SYSTEMS, 2013, 50 : 211 - 217
  • [5] Cross-language plagiarism detection over continuous-space- and knowledge graph-based representations of language
    Franco-Salvador, Marc
    Gupta, Parth
    Rosso, Paolo
    Banchs, Rafael E.
    [J]. KNOWLEDGE-BASED SYSTEMS, 2016, 111 : 87 - 99
  • [6] A New Approach for Cross-Language Plagiarism Analysis
    Pereira, Rafael Corezola
    Moreira, Viviane P.
    Galante, Renata
    [J]. MULTILINGUAL AND MULTIMODAL INFORMATION ACCESS EVALUATION, 2010, 6360 : 15 - 26
  • [7] Cross-Language Plagiarism Detection Model Based On Multiple Features
    Liu, Gang
    Dong, Yichao
    Li, Guangxi
    [J]. 26TH IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATIONS (IEEE ISCC 2021), 2021
  • [8] On the Mono- and Cross-Language Detection of Text Reuse and Plagiarism
    Barron-Cedeno, Alberto
    [J]. SIGIR 2010: PROCEEDINGS OF THE 33RD ANNUAL INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH DEVELOPMENT IN INFORMATION RETRIEVAL, 2010 : 914 - 914
  • [9] Meta-Analysis of Cross-Language Plagiarism and Self-Plagiarism Detection Methods for Russian-English Language Pair
    Tlitova, Alina
    Toschev, Alexander
    Talanov, Max
    Kurnosov, Vitaliy
    [J]. FRONTIERS IN COMPUTER SCIENCE, 2020, 2
  • [10] Word Embedding for High Performance Cross-Language Plagiarism Detection Techniques
    Bouaine, Chaimaa
    Benabbou, Faouzia
    Sadgali, Imane
    [J]. International Journal of Interactive Mobile Technologies, 2023, 17 (10) : 69 - 91