Robust Toponym Resolution Based on Surface Statistics

Cited by: 1
Authors
Sano, Tomohisa [1]
Nobesawa, Shiho Hoshi [2]
Okamoto, Hiroyuki [1]
Susuki, Hiroya [1]
Matsubara, Masaki [1]
Saito, Hiroaki [1]
Affiliations
[1] Keio Univ, Yokohama, Kanagawa 2238522, Japan
[2] Tokyo City Univ, Tokyo 1588557, Japan
Keywords
natural language processing; toponym resolution; area identification; statistical information
DOI
10.1587/transinf.E92.D.2313
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Toponyms and other named entities are major issues in unknown-word processing. Our purpose is to salvage unknown toponyms, not only to avoid noise but also to provide them with information on the candidate areas to which they may belong. Most previous toponym resolution methods targeted disambiguation among area candidates, which is caused by the multiple existence of a toponym. These approaches were mostly based on gazetteers and contexts. When it comes to documents that may contain toponyms from around the world, such as newspaper articles, toponym resolution is not just ambiguity resolution but area candidate selection from all the areas on Earth. We therefore propose an automatic toponym resolution method that identifies a toponym's area candidates based only on its surface statistics, in place of dictionary-lookup approaches. Our method combines two modules, area candidate reduction and area candidate examination, which uses block-unit data, to obtain high accuracy without reducing the recall rate. Our experiments showed 85.54% precision, 91.92% recall, and an F-measure of 0.89 on average. This method is a flexible and robust approach for toponym resolution targeting an unrestricted number of areas.
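As a quick consistency check on the reported figures, the sketch below recomputes a balanced F-measure from the average precision and recall given in the abstract. It assumes the F-measure is the standard harmonic mean (F1) and that it can be computed directly from the averaged values; the record does not state either, and the paper may instead have averaged per-area F-measures. The function name `f_measure` is hypothetical.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's F-measure uses this standard definition.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Average values reported in the abstract, expressed as fractions.
precision = 0.8554
recall = 0.9192

f1 = f_measure(precision, recall)
print(round(f1, 2))  # 0.89
```

Under these assumptions the result rounds to the 0.89 reported in the abstract.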
Pages: 2313-2320
Page count: 8