UMUCorpusClassifier: Compilation and evaluation of linguistic corpus for Natural Language Processing tasks

Cited by: 14

Authors:

Garcia-Diaz, Jose Antonio [1]

Almela, Angela [2]

Alcaraz-Marmol, Gema [3]

Valencia-Garcia, Rafael [1]

Affiliations:

[1] Univ Murcia, Fac Informat, Murcia, Spain

[2] Univ Murcia, Fac Letras, Murcia, Spain

[3] Univ Castilla La Mancha, Dept Filol Moderna, Ciudad Real, Spain

Source:

PROCESAMIENTO DEL LENGUAJE NATURAL | 2020 / Issue 65

Keywords:

Corpus compilation; Document classification

DOI:

10.26342/2020-65-22

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Developing an annotated corpus is a very time-consuming task. Although some researchers have proposed annotating corpora automatically using ad hoc heuristics, valid hypotheses cannot always be made. Even when the annotation is performed by human annotators, the quality of the corpus is heavily influenced by disagreements between annotators or by an annotator's inconsistency with their own earlier judgments. A lack of supervision of the annotation process can therefore lead to a poor-quality corpus. In this work, we present a demonstration of UMUCorpusClassifier, an NLP tool that helps researchers compile corpora and coordinate and supervise the annotation process. The tool eases daily supervision and makes it possible to detect deviations and inconsistencies during the early stages of annotation.

Pages: 139-142

Page count: 4
