Robust Toponym Resolution Based on Surface Statistics

Cited by: 1
Authors
Sano, Tomohisa [1]
Nobesawa, Shiho Hoshi [2]
Okamoto, Hiroyuki [1]
Susuki, Hiroya [1]
Matsubara, Masaki [1]
Saito, Hiroaki [1]
Affiliations
[1] Keio Univ, Yokohama, Kanagawa 2238522, Japan
[2] Tokyo City Univ, Tokyo 1588557, Japan
Keywords
natural language processing; toponym resolution; area identification; statistical information;
DOI
10.1587/transinf.E92.D.2313
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Toponyms and other named entities are a major issue in unknown-word processing. Our purpose is to salvage unknown toponyms, not only to avoid noise but also to provide them with information on the candidate areas to which they may belong. Most previous toponym resolution methods targeted disambiguation among area candidates, which is needed when several places share the same toponym. These approaches were mostly based on gazetteers and contexts. For documents that may contain toponyms from anywhere in the world, such as newspaper articles, toponym resolution is not just ambiguity resolution but the selection of area candidates from all the areas on Earth. We therefore propose an automatic toponym resolution method that identifies the area candidates of toponyms based only on their surface statistics, in place of dictionary-lookup approaches. Our method combines two modules, area candidate reduction and area candidate examination using block-unit data, to obtain high accuracy without reducing the recall rate. Our empirical results showed 85.54% precision, 91.92% recall, and an F-measure of 0.89 on average. This method is a flexible and robust approach to toponym resolution targeting an unrestricted number of areas.
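As a quick consistency check on the figures reported in the abstract, the sketch below recomputes the F-measure from the stated precision and recall. It assumes the balanced F1 definition (the harmonic mean of precision and recall); the abstract does not say which variant was used, and an average of per-area F-measures need not equal the F-measure of the averaged precision and recall. The function name `f_measure` and its `beta` parameter are illustrative only and do not come from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta=1 is the harmonic mean of precision and recall."""
    # Guard against a zero denominator when either input is zero.
    if precision <= 0.0 or recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Values quoted in the abstract (averages over the evaluated areas).
reported_precision = 0.8554
reported_recall = 0.9192

# Assumption: balanced F1, since the abstract does not name the variant.
f1 = f_measure(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}")
```

Under this assumption the result rounds to 0.89, which agrees with the reported value.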
Pages: 2313-2320
Number of pages: 8