A pragmatic guide to geoparsing evaluation: Toponyms, Named Entity Recognition and pragmatics

Authors
Milan Gritta
Mohammad Taher Pilehvar
Nigel Collier
Affiliation
[1] University of Cambridge, Language Technology Lab (LTL), Department of Theoretical and Applied Linguistics (DTAL)
Keywords
Geoparsing; Toponym resolution; Geotagging; Geocoding; Named Entity Recognition; Machine learning; Evaluation framework; Geonames; Toponyms; Natural language understanding; Pragmatics;
Abstract
Empirical methods in geoparsing have thus far lacked a standard evaluation framework describing the task, metrics and data used to compare state-of-the-art systems. Evaluation is further made inconsistent, even unrepresentative of real-world usage, by the lack of distinction between the different types of toponyms, which necessitates new guidelines, a consolidation of metrics and a detailed toponym taxonomy with implications for Named Entity Recognition (NER) and beyond. To address these deficiencies, our manuscript introduces a new framework in three parts. (Part 1) Task Definition: clarified via corpus linguistic analysis proposing a fine-grained Pragmatic Taxonomy of Toponyms. (Part 2) Metrics: discussed and reviewed for a rigorous evaluation, including recommendations for NER/Geoparsing practitioners. (Part 3) Evaluation data: shared via a new dataset called GeoWebNews to provide test/train examples and enable immediate use of our contributions. In addition to fine-grained Geotagging and Toponym Resolution (Geocoding), this dataset is also suitable for prototyping and evaluating machine learning NLP models.
Pages: 683-712
Page count: 29