A pragmatic guide to geoparsing evaluation: Toponyms, Named Entity Recognition and pragmatics

Authors
Milan Gritta
Mohammad Taher Pilehvar
Nigel Collier
Affiliations
[1] University of Cambridge, Language Technology Lab (LTL), Department of Theoretical and Applied Linguistics (DTAL)
Keywords
Geoparsing; Toponym resolution; Geotagging; Geocoding; Named Entity Recognition; Machine learning; Evaluation framework; Geonames; Toponyms; Natural language understanding; Pragmatics
DOI
Not available
Abstract
Empirical methods in geoparsing have thus far lacked a standard evaluation framework describing the task, metrics and data used to compare state-of-the-art systems. Evaluation is further made inconsistent, even unrepresentative of real-world usage, by the lack of distinction between the different types of toponyms. This necessitates new guidelines, a consolidation of metrics and a detailed toponym taxonomy with implications for Named Entity Recognition (NER) and beyond. To address these deficiencies, our manuscript introduces a new framework in three parts. (Part 1) Task Definition: clarified via corpus linguistic analysis proposing a fine-grained Pragmatic Taxonomy of Toponyms. (Part 2) Metrics: discussed and reviewed for a rigorous evaluation, including recommendations for NER/Geoparsing practitioners. (Part 3) Evaluation data: shared via a new dataset called GeoWebNews to provide test/train examples and enable immediate use of our contributions. In addition to fine-grained Geotagging and Toponym Resolution (Geocoding), this dataset is also suitable for prototyping and evaluating machine learning NLP models.
Pages: 683 - 712
Number of pages: 29