Long-Tail Hashing

Cited by: 6
Authors
Chen, Yong [1 ,2 ]
Hou, Yuqing [3 ]
Leng, Shu [4 ]
Zhang, Qing [3 ]
Lin, Zhouchen [1 ,2 ]
Zhang, Dell [5 ,6 ]
Affiliations
[1] Peking Univ, Sch EECS, Key Lab Machine Percept MoE, Beijing, Peoples R China
[2] Pazhou Lab, Guangzhou, Peoples R China
[3] Meituan, Beijing, Peoples R China
[4] Tsinghua Univ, Dept Automat, Beijing, Peoples R China
[5] Blue Prism AI Labs, London, England
[6] Birkbeck Univ London, London, England
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
learning to hash; long-tail datasets; memory network; large-scale; multimedia retrieval; ITERATIVE QUANTIZATION; PROCRUSTEAN APPROACH; DISTRIBUTIONS; PARETO; CODES; SMOTE;
DOI
10.1145/3404835.3462888
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Hashing, which represents data items as compact binary codes, has become an increasingly popular technique, e.g., for large-scale image retrieval, owing to its very fast search speed and its extremely economical memory consumption. However, existing hashing methods all try to learn binary codes from artificially balanced datasets, which are not commonly available in real-world scenarios. In this paper, we propose Long-Tail Hashing Network (LTHNet), a novel two-stage deep hashing approach that addresses the problem of learning to hash for more realistic datasets where the data labels roughly exhibit a long-tail distribution. Specifically, the first stage learns relaxed embeddings of the given dataset, with its long-tail characteristic taken into account, via an end-to-end deep neural network; the second stage binarizes the obtained embeddings. A critical part of LTHNet is its dynamic meta-embedding module, extended with a determinantal point process, which can adaptively realize visual knowledge transfer between head and tail classes and thus enrich image representations for hashing. Our experiments show that LTHNet achieves dramatic performance improvements over all state-of-the-art competitors on long-tail datasets, with little or no sacrifice on balanced datasets. Further analyses reveal that, to our surprise, directly manipulating class weights in the loss function has little effect, whereas the extended dynamic meta-embedding module, the use of cross-entropy loss instead of square loss, and a relatively small training batch size all contribute to LTHNet's success.
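To make the abstract's "compact binary codes" concrete, here is a minimal sketch of the general hashing-retrieval idea: real-valued embeddings are binarized, and search ranks items by Hamming distance. The abstract does not say how LTHNet's second stage binarizes its embeddings, so the sign thresholding below is an assumption, and the names `binarize`, `hamming`, and `search` and the toy vectors are hypothetical.

```python
from typing import List, Sequence


def binarize(embedding: Sequence[float]) -> int:
    """Pack an embedding into an integer bit code.

    Assumption (not from the abstract): a component maps to bit 1
    when it is positive and to bit 0 otherwise.
    """
    code = 0
    for value in embedding:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Count the bit positions where two codes differ."""
    return bin(a ^ b).count("1")


def search(query_code: int, database_codes: Sequence[int], k: int) -> List[int]:
    """Return the indices of the k codes closest to the query in Hamming distance."""
    order = sorted(
        range(len(database_codes)),
        key=lambda i: hamming(query_code, database_codes[i]),
    )
    return order[:k]


# Toy 4-dimensional embeddings standing in for the output of a first stage.
database = [
    [0.9, -0.2, 0.4, -0.7],
    [-0.5, 0.3, -0.1, 0.8],
    [0.6, -0.4, 0.2, 0.1],
]
codes = [binarize(e) for e in database]
query = binarize([0.7, -0.1, 0.5, -0.3])
top = search(query, codes, k=2)
```

Because each code is a single integer, comparing two items costs one XOR and a bit count, which is the source of the search-speed and memory advantages the abstract mentions.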
Pages: 1328-1338
Number of pages: 11