Two-Stage Hashing for Fast Document Retrieval

Authors
Li, Hao [1]
Liu, Wei [2]
Ji, Heng [1]
Affiliations
[1] Rensselaer Polytech Inst, Dept Comp Sci, Troy, NY 12180 USA
[2] IBM TJ Watson Res Ctr, Yorktown Hts, NY USA
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
This work achieves sublinear-time Nearest Neighbor Search (NNS) in massive-scale document collections. The primary contribution is a two-stage unsupervised hashing framework that integrates two state-of-the-art hashing algorithms, Locality Sensitive Hashing (LSH) and Iterative Quantization (ITQ). LSH prunes the neighbor candidates, while ITQ provides efficient and effective reranking of the candidate pool that LSH retrieves. The framework also exploits both term and topic similarity among documents, which leads to more precise document retrieval. Experimental results show that the hashing-based retrieval approach closely approximates the conventional Information Retrieval (IR) method in retrieving semantically similar documents, while achieving a speedup of more than one order of magnitude in query time.
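The two-stage idea in the abstract can be illustrated with a minimal sketch: random-hyperplane LSH selects a candidate bucket, and ITQ binary codes rerank the candidates by Hamming distance. This is not the authors' implementation. The function names, the single shared feature matrix, the single hash table, and all parameter values are assumptions made for illustration; the abstract does not say which features (term or topic) feed each stage.

```python
from collections import defaultdict

import numpy as np


def lsh_keys(X, planes):
    """Random-hyperplane LSH: one sign bit per hyperplane, packed into a bytes key."""
    bits = (X @ planes.T) > 0
    return [row.tobytes() for row in bits.astype(np.uint8)]


def build_index(X, planes):
    """Stage 1 index: map each LSH key to the list of document ids in that bucket."""
    buckets = defaultdict(list)
    for doc_id, key in enumerate(lsh_keys(X, planes)):
        buckets[key].append(doc_id)
    return buckets


def train_itq(X, n_bits, n_iter=50, seed=0):
    """ITQ: PCA projection, then a rotation R that reduces quantization error."""
    rng = np.random.default_rng(seed)
    mean = X.mean(axis=0)
    Xc = X - mean
    # PCA directions are the top right-singular vectors of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_bits].T                      # shape (d, n_bits)
    P = Xc @ W                             # projected data, shape (n, n_bits)
    # Start from a random orthogonal matrix.
    R, _ = np.linalg.qr(rng.standard_normal((n_bits, n_bits)))
    for _ in range(n_iter):
        B = np.where(P @ R >= 0, 1.0, -1.0)   # fix R, update binary codes
        U, _, Vt2 = np.linalg.svd(P.T @ B)    # fix B, solve orthogonal Procrustes
        R = U @ Vt2
    return mean, W, R


def itq_encode(X, mean, W, R):
    """Binary codes (0/1 per bit) for the rows of X under a trained ITQ model."""
    return ((X - mean) @ W @ R >= 0).astype(np.uint8)


def search(q, planes, buckets, codes, itq, top_k=10):
    """Stage 1: fetch the query's LSH bucket. Stage 2: rerank it by ITQ Hamming distance."""
    key = lsh_keys(q[None, :], planes)[0]
    cand = np.array(buckets.get(key, []), dtype=int)
    if cand.size == 0:
        return cand
    q_code = itq_encode(q[None, :], *itq)[0]
    hamming = (codes[cand] != q_code).sum(axis=1)
    return cand[np.argsort(hamming, kind="stable")[:top_k]]


# Synthetic stand-in for document feature vectors (assumption: 200 docs, 32 dims).
rng = np.random.default_rng(0)
docs = rng.standard_normal((200, 32))
planes = rng.standard_normal((4, 32))      # 4 hyperplanes -> up to 16 buckets
buckets = build_index(docs, planes)
itq = train_itq(docs, n_bits=16)
codes = itq_encode(docs, *itq)
neighbors = search(docs[5], planes, buckets, codes, itq, top_k=5)
```

Only the documents in the query's bucket are compared in the second stage, which is where the pruning comes from. A practical system would likely use several hash tables or multi-probing to limit missed neighbors; that is omitted here for brevity.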
Pages: 495-500
Number of pages: 6