Reuse and plagiarism in Speech and Natural Language Processing publications

Cited by: 5

Authors:

Mariani, Joseph [1]

Francopoulo, Gil [1,2]

Paroubek, Patrick [1]

Affiliations:

[1] Univ Paris Saclay, CNRS, LIMSI, Orsay, France

[2] Tagmatica, Paris, France

Source:

INTERNATIONAL JOURNAL ON DIGITAL LIBRARIES | 2018 / Vol. 19 / Issue 2-3

Keywords:

Plagiarism; Detection; Text reuse; Natural Language Processing; Speech Processing; Scientometrics; Informetrics;

DOI:

10.1007/s00799-017-0211-0

Chinese Library Classification:

G25 [Library science, librarianship]; G35 [Information science, information work];

Discipline classification codes:

1205 ; 120501 ;

Abstract:

The aim of this experiment is to present an easy way to compare fragments of texts in order to detect (supposed) results of copy and paste operations between articles in the domain of Natural Language Processing (NLP), including Speech Processing. The search space of the comparisons is a corpus labeled as NLP4NLP, which includes 34 different conferences and journals and gathers a large part of the NLP activity over the past 50 years. This study considers the similarity between the papers of each individual event and the complete set of papers in the whole corpus, according to four different types of relationship (self-reuse, self-plagiarism, reuse and plagiarism) and in both directions: a paper borrowing a fragment of text from another paper of the corpus (that we will call the source paper), or in the reverse direction, fragments of text from the source paper being borrowed and inserted in another paper of the corpus. The results show that self-reuse is rather a common practice, but that plagiarism seems to be very unusual, and that both stay within legal and ethical limits.
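
The abstract does not spell out how fragments are compared or how the four relationship types are told apart, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that overlap is measured on shared word n-grams (the window size `n=7` is a hypothetical default) and that the four labels depend on two yes/no questions: whether the two papers share an author, and whether the borrowing paper cites the source paper. The function names `word_ngrams`, `overlap_ratio`, and `classify` are illustrative.

```python
from typing import Set, Tuple


def word_ngrams(text: str, n: int = 7) -> Set[Tuple[str, ...]]:
    """Return the set of lowercase word n-grams in `text`.

    The window size n is an assumption for illustration only.
    """
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_ratio(borrower: str, source: str, n: int = 7) -> float:
    """Fraction of the borrower's n-grams that also occur in the source."""
    borrowed = word_ngrams(borrower, n)
    if not borrowed:
        return 0.0  # text shorter than n words: nothing to compare
    return len(borrowed & word_ngrams(source, n)) / len(borrowed)


def classify(shared_author: bool, cites_source: bool) -> str:
    """Map a detected overlap to one of the four relationship types.

    Assumed rule (not stated in the abstract): a shared author makes it
    a "self-" case, and a missing citation makes it a "plagiarism" case.
    """
    if shared_author:
        return "self-reuse" if cites_source else "self-plagiarism"
    return "reuse" if cites_source else "plagiarism"
```

Under these assumptions, a pair of papers would first be scored with `overlap_ratio` in each direction, and only pairs above some chosen threshold would be passed to `classify`.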

Pages: 113-126

Page count: 14

50 records in total

[1] Measuring Innovation in Speech and Language Processing Publications
Mariani, Joseph
Francopoulo, Gil
Paroubek, Patrick
PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018, : 1890 - 1895
[2] Speech and language processing: An introduction to natural language processing, computational linguistics, and speech recognition
Teller, V
COMPUTATIONAL LINGUISTICS, 2000, 26 (04) : 638 - 641
[3] The State of Profanity Obfuscation in Natural Language Processing Scientific Publications
Nozza, Debora
Hovy, Dirk
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 3897 - 3909
[4] TextProc - a natural language processing framework and its use as plagiarism detection system
Brezovnik, Janez
Ojstersek, Milan
INTERNATIONAL JOURNAL OF EDUCATION AND INFORMATION TECHNOLOGIES, 2011, 5 (03) : 293 - 300
[5] Translating Speech to Indian Sign Language Using Natural Language Processing
Sharma, Purushottam
Tulsian, Devesh
Verma, Chaman
Sharma, Pratibha
Nancy, Nancy
FUTURE INTERNET, 2022, 14 (09)
[6] Potential of natural language processing for metadata extraction from environmental scientific publications
Blanchy, Guillaume
Albrecht, Lukas
Koestel, John
Garre, Sarah
SOIL, 2023, 9 (01) : 155 - 168
[7] Incident Management Optimization through the Reuse of Experiences and Natural Language Processing
Vieira Bezerra, Glauber de Tarso
Monteiro Pinheiro, Vladia Celia
Albuquerque, Adriano Bessa
2014 9TH INTERNATIONAL CONFERENCE ON THE QUALITY OF INFORMATION AND COMMUNICATIONS TECHNOLOGY (QUATIC), 2014, : 247 - 254
[8] Incident Management Optimization through the Reuse of Experiences and Natural Language Processing
Bezerra, Glauber
Pinheiro, Vladia
Bessa, Adriano
2014 9TH INTERNATIONAL CONFERENCE ON THE QUALITY OF INFORMATION AND COMMUNICATIONS TECHNOLOGY (QUATIC), 2014, : 58 - 65
[9] FarSpeech: Arabic Natural Language Processing for Live Arabic Speech
Eldesouki, Mohamed
Gopee, Naassih
Ali, Ahmed
Darwish, Kareem
INTERSPEECH 2019, 2019, : 2372 - 2373
[10] Towards Natural Language Processing with Figures of Speech in Hindi Poetry
Audichya, Milind Kumar
Saini, Jatinderkumar R.
INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, 12 (03) : 128 - 133
