Robust Toponym Resolution Based on Surface Statistics

Cited by: 1
Authors
Sano, Tomohisa [1]
Nobesawa, Shiho Hoshi [2]
Okamoto, Hiroyuki [1]
Susuki, Hiroya [1]
Matsubara, Masaki [1]
Saito, Hiroaki [1]
Affiliations
[1] Keio Univ, Yokohama, Kanagawa 2238522, Japan
[2] Tokyo City Univ, Tokyo 1588557, Japan
Keywords
natural language processing; toponym resolution; area identification; statistical information
DOI
10.1587/transinf.E92.D.2313
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Toponyms and other named entities are a main issue in the unknown word processing problem. Our purpose is to salvage unknown toponyms, not only to avoid noise but also to provide them with information on the area candidates to which they may belong. Most previous toponym resolution methods targeted disambiguation among area candidates, which is needed when the same toponym exists in multiple places. These approaches were mostly based on gazetteers and contexts. When it comes to documents that may contain toponyms from anywhere in the world, such as newspaper articles, toponym resolution is not just ambiguity resolution but area candidate selection from all the areas on Earth. We therefore propose an automatic toponym resolution method that identifies the area candidates of toponyms based only on their surface statistics, in place of dictionary-lookup approaches. Our method combines two modules, area candidate reduction and area candidate examination, the latter of which uses block-unit data, to obtain high accuracy without reducing the recall rate. Our empirical results showed 85.54% precision, 91.92% recall, and an F-measure of 0.89 on average. This method is a flexible and robust approach for toponym resolution targeting an unrestricted number of areas.
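As a quick consistency check on the reported figures, the sketch below computes the balanced F-measure from the stated precision and recall. It assumes the F-measure is the harmonic mean of precision and recall (beta = 1), which the abstract does not state explicitly. The function name `f_measure` is hypothetical, and because the reported values are averages, the authors' own averaging procedure may differ from this single calculation.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta = 1 gives the harmonic mean of P and R."""
    # Guard against division by zero when both inputs are zero.
    if precision <= 0.0 and recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Average precision and recall as reported in the abstract.
precision = 0.8554
recall = 0.9192

# Assumption: balanced F-measure (beta = 1).
f1 = f_measure(precision, recall)
print(f"F-measure: {f1:.2f}")  # prints "F-measure: 0.89"
```

Under this assumption the result rounds to 0.89, which agrees with the value given in the abstract.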
Pages: 2313-2320
Number of pages: 8