A pragmatic guide to geoparsing evaluation: Toponyms, Named Entity Recognition and pragmatics

Cited by: 0

Authors:

Milan Gritta

Mohammad Taher Pilehvar

Nigel Collier

Affiliations:

[1] University of Cambridge, Language Technology Lab (LTL), Department of Theoretical and Applied Linguistics (DTAL)

Source:

Language Resources and Evaluation | 2020 / Vol. 54

Keywords:

Geoparsing; Toponym resolution; Geotagging; Geocoding; Named Entity Recognition; Machine learning; Evaluation framework; Geonames; Toponyms; Natural language understanding; Pragmatics

DOI:

Not available

Abstract:

Empirical methods in geoparsing have thus far lacked a standard evaluation framework describing the task, metrics and data used to compare state-of-the-art systems. Evaluation is further made inconsistent, even unrepresentative of real-world usage, by the lack of distinction between the different types of toponyms, which necessitates new guidelines, a consolidation of metrics and a detailed toponym taxonomy with implications for Named Entity Recognition (NER) and beyond. To address these deficiencies, our manuscript introduces a new framework in three parts. (Part 1) Task Definition: clarified via corpus linguistic analysis proposing a fine-grained Pragmatic Taxonomy of Toponyms. (Part 2) Metrics: discussed and reviewed for a rigorous evaluation including recommendations for NER/Geoparsing practitioners. (Part 3) Evaluation data: shared via a new dataset called GeoWebNews to provide test/train examples and enable immediate use of our contributions. In addition to fine-grained Geotagging and Toponym Resolution (Geocoding), this dataset is also suitable for prototyping and evaluating machine learning NLP models.

Pages: 683-712

Page count: 29
