Effectively Searching Maps in Web Documents

Cited by: 0
Authors
Tan, Qingzhao [1 ]
Mitra, Prasenjit [1 ]
Giles, C. Lee [1 ]
Affiliation
[1] Penn State Univ, University Pk, PA 16802 USA
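Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code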
0812 ;
Abstract
Maps are an important source of information in archaeology and other sciences. Users want to search for historical maps to determine the recorded history of the political geography of regions in different eras, to find out exactly where archaeological artifacts were discovered, etc. Currently, they have to use a generic search engine and add the term "map" along with other keywords to search for maps. This crude method generates a significant number of false positives that the user must sift through to get the desired results. To reduce this manual effort, we propose an automatic map identification, indexing, and retrieval system that enables users to search and retrieve maps appearing in a large corpus of digital documents using simple keyword queries. We identify features that help distinguish maps from other figures in digital documents and show how a Support-Vector-Machine-based classifier can be used to identify maps. We propose map-level metadata (e.g., captions, references to the maps in text) and document-level metadata (e.g., title, abstract, citations, how recent the publication is), and show how they can be automatically extracted and indexed. Our novel ranking algorithm weights different metadata fields differently and also uses the document-level metadata to help rank retrieved maps. Empirical evaluations show which features should be selected and which metadata fields should be weighted more. We also demonstrate improved retrieval results in comparison to adaptations of existing methods for map retrieval. Our map search engine has been deployed in an online map-search system that is part of the Blind-Review digital library system.
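The idea of weighting metadata fields differently when ranking maps can be illustrated with a minimal sketch. This is not the authors' algorithm: the field names, the weight values, and the helpers `score_map` and `rank_maps` are all hypothetical, and simple term counting stands in for whatever scoring function the paper actually uses.

```python
from collections import Counter

# Hypothetical field weights: illustrative values only, not taken from the paper.
# Map-level fields (caption, reference_text) are weighted above
# document-level fields (title, abstract) purely as an assumption.
FIELD_WEIGHTS = {
    "caption": 3.0,
    "reference_text": 2.0,
    "title": 1.5,
    "abstract": 1.0,
}


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def score_map(query, fields, weights=FIELD_WEIGHTS):
    """Sum, over all fields, weight * (query-term occurrences in that field)."""
    terms = set(tokenize(query))
    total = 0.0
    for name, text in fields.items():
        counts = Counter(tokenize(text))
        matches = sum(counts[term] for term in terms)
        total += weights.get(name, 0.0) * matches
    return total


def rank_maps(query, maps):
    """Return map ids sorted by descending score, ties broken by id."""
    scored = [(score_map(query, fields), map_id) for map_id, fields in maps.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [map_id for _, map_id in scored]


if __name__ == "__main__":
    maps = {
        "m1": {"caption": "Map of Roman Britain", "title": "Trade routes"},
        "m2": {"caption": "Excavation photo", "abstract": "roman britain map survey"},
    }
    print(rank_maps("roman britain map", maps))
```

In this sketch a query term matched in a caption counts three times as much as the same term matched in an abstract, so a map whose caption matches the query outranks one whose enclosing document merely mentions the terms.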
Pages: 162 / +
Page count: 3