Self-Taught Hashing for Fast Similarity Search

Cited by: 0
Authors
Zhang, Dell [1]
Wang, Jun [1]
Cai, Deng [1]
Lu, Jinsong [1]
Affiliations
[1] Univ London, DCSIS, London WC1E 7HX, England
Keywords
Similarity Search; Semantic Hashing; Laplacian Eigenmap; Support Vector Machine; DIMENSIONALITY;
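DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract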
Fast similarity search at large scale is of great importance to many Information Retrieval (IR) applications. A promising way to accelerate similarity search is semantic hashing, which designs compact binary codes for a large number of documents so that semantically similar documents are mapped to similar codes (within a short Hamming distance). Although some recently proposed techniques are able to generate high-quality codes for documents known in advance, obtaining the codes for previously unseen documents remains a very challenging problem. In this paper, we emphasise this issue and propose a novel Self-Taught Hashing (STH) approach to semantic hashing: we first find the optimal l-bit binary codes for all documents in the given corpus via unsupervised learning, and then train l classifiers via supervised learning to predict the l-bit code for any previously unseen query document. Our experiments on three real-world text datasets show that the proposed approach, using binarised Laplacian Eigenmap (LapEig) and linear Support Vector Machine (SVM), significantly outperforms state-of-the-art techniques.
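The query-time side of the approach described in the abstract can be illustrated with a minimal sketch. It assumes the l linear classifiers have already been trained, so the training stage (binarised LapEig followed by linear SVM) is not shown. The names `predict_code`, `hamming`, `search`, `WEIGHTS`, `BIASES` and `CORPUS` are hypothetical, and the weights and codes are hand-picked toy values, not results from the paper.

```python
from typing import Dict, List, Sequence


def predict_code(x: Sequence[float],
                 weights: Sequence[Sequence[float]],
                 biases: Sequence[float]) -> int:
    """Pack the outputs of l linear classifiers into an l-bit integer code."""
    code = 0
    for w, b in zip(weights, biases):
        # Each classifier decides one bit from the sign of its linear score.
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        code = (code << 1) | (1 if score > 0 else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def search(query_code: int, corpus_codes: Dict[str, int], radius: int) -> List[str]:
    """Return ids of documents whose code lies within `radius` bits of the query."""
    hits = [(hamming(query_code, c), doc_id) for doc_id, c in corpus_codes.items()]
    return [doc_id for d, doc_id in sorted(hits) if d <= radius]


# Hypothetical 3-bit setup: toy weights and codes, not learned from data.
WEIGHTS = [[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]]
BIASES = [0.0, -1.0, 0.0]
CORPUS = {"doc_a": 0b110, "doc_b": 0b111, "doc_c": 0b001}

query = predict_code([2.0, 0.5], WEIGHTS, BIASES)
neighbours = search(query, CORPUS, radius=1)
```

With these toy values the query vector is hashed to the code `0b110`, and the search returns the documents whose codes differ from it in at most one bit. A linear scan is used here for brevity; with short codes, the corpus could instead be bucketed by code so that only nearby buckets are probed.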
Pages: 18 - 25
Number of pages: 8
Related papers
50 in total
  • [1] Deep Self-Taught Hashing for Image Retrieval
    Liu, Yu
    Song, Jingkuan
    Zhou, Ke
    Yan, Lingyu
    Liu, Li
    Zou, Fuhao
    Shao, Ling
    [J]. IEEE TRANSACTIONS ON CYBERNETICS, 2019, 49 (06) : 2229 - 2241
  • [2] Deep Self-taught Hashing for Image Retrieval
    Zhou, Ke
    Liu, Yu
    Song, Jinkuan
    Yan, Linyu
    Zou, Fuhao
    Shen, Fumin
    [J]. MM'15: PROCEEDINGS OF THE 2015 ACM MULTIMEDIA CONFERENCE, 2015 : 1215 - 1218
  • [3] 'Self-taught'
    Parry, M
    [J]. FIDDLEHEAD, 2005, (226) : 15 - 15
  • [4] Adaptive Hashing for Fast Similarity Search
    Cakir, Fatih
    Sclaroff, Stan
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015 : 1044 - 1052
  • [5] Malay Self-Taught
    Lowther, William E.
    [J]. MODERN LANGUAGE JOURNAL, 1946, 30 (04) : 228 - 228
  • [6] Tamil Self-Taught
    Unknown
    [J]. SCOTTISH GEOGRAPHICAL MAGAZINE, 1908, 24 (01) : 54 - 54
  • [7] A SELF-TAUGHT BIOLOGIST
    ROTHSCHILD, M
    [J]. GENETIC ENGINEER AND BIOTECHNOLOGIST, 1991, 11 (02) : 9 - 9
  • [8] DEEP SELF-TAUGHT GRAPH EMBEDDING HASHING WITH PSEUDO LABELS FOR IMAGE RETRIEVAL
    Liu, Yu
    Wang, Yangtao
    Song, Jingkuan
    Guo, Chan
    Zhou, Ke
    Xiao, Zhili
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2020,
  • [9] System is self-taught
    Unknown
    [J]. NUCLEAR ENGINEERING INTERNATIONAL, 2000, 45 (553) : 25 - 25
  • [10] MOTIVATED AND SELF-TAUGHT
    ANDERSON, RW
    [J]. ELECTRONIC DESIGN, 1979, 27 (05) : 11 - 11