Effectively Searching Maps in Web Documents

Authors
Tan, Qingzhao [1]
Mitra, Prasenjit [1]
Giles, C. Lee [1]
Affiliations
[1] Penn State Univ, University Pk, PA 16802 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Maps are an important source of information in archaeology and other sciences. Users want to search for historical maps to determine the recorded history of the political geography of regions in different eras, to find out exactly where archaeological artifacts were discovered, etc. Currently, they have to use a generic search engine and add the term "map" along with other keywords to search for maps. This crude method generates a significant number of false positives that the user must cull to get the desired results. To reduce this manual effort, we propose an automatic map identification, indexing, and retrieval system that enables users to search and retrieve maps appearing in a large corpus of digital documents using simple keyword queries. We identify features that can help in distinguishing maps from other figures in digital documents and show how a Support-Vector-Machine-based classifier can be used to identify maps. We propose map-level metadata (e.g., captions, references to the maps in text) and document-level metadata (e.g., title, abstract, citations, how recent the publication is) and show how they can be automatically extracted and indexed. Our novel ranking algorithm weights different metadata fields differently and also uses the document-level metadata to help rank retrieved maps. Empirical evaluations show which features should be selected and which metadata fields should be weighted more. We also demonstrate improved retrieval results in comparison to adaptations of existing methods for map retrieval. Our map search engine has been deployed in an online map-search system that is part of the Blind-Review digital library system.
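The idea of weighting metadata fields differently when ranking maps can be illustrated with a minimal sketch. This is not the authors' algorithm: the field names, the weight values, and the helper functions `tokenize`, `score_map`, and `rank_maps` are all hypothetical, and the score is a plain weighted count of query-term matches chosen only to show how map-level and document-level fields might contribute unequally.

```python
from collections import Counter

# Hypothetical field weights; the abstract does not state the actual values.
FIELD_WEIGHTS = {"caption": 3.0, "reference_text": 2.0, "title": 1.5, "abstract": 1.0}


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately simple tokenizer."""
    return text.lower().split()


def score_map(query, fields, weights=FIELD_WEIGHTS):
    """Sum, over fields, weight * number of query-term occurrences in the field."""
    terms = set(tokenize(query))
    total = 0.0
    for name, text in fields.items():
        counts = Counter(tokenize(text))
        matches = sum(counts[t] for t in terms)
        # Fields without a configured weight contribute nothing.
        total += weights.get(name, 0.0) * matches
    return total


def rank_maps(query, maps):
    """Return (map_id, score) pairs sorted by descending score."""
    scored = [(map_id, score_map(query, fields)) for map_id, fields in maps.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: "caption" is map-level metadata, "title"/"abstract" are document-level.
maps = {
    "fig1": {"caption": "map of roman settlements", "title": "roman trade routes"},
    "fig2": {"caption": "pottery fragments", "abstract": "a map based survey of roman sites"},
}
ranking = rank_maps("roman map", maps)
print(ranking)
```

In this toy data, `fig1` ranks first because both query terms appear in its caption, the most heavily weighted field, while `fig2` matches only in the lower-weighted abstract.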
Pages: 162 / +
Page count: 3