HashFormer: Vision Transformer Based Deep Hashing for Image Retrieval

Cited by: 34
Authors
Li, Tao [1 ]
Zhang, Zheng [1 ]
Pei, Lishen [2 ]
Gan, Yan [3 ]
Affiliations
[1] Open Univ Henan, Zhengzhou 450046, Peoples R China
[2] Henan Univ Econ & Law, Zhengzhou 450046, Peoples R China
[3] Chongqing Univ, Chongqing 400044, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Transformers; Binary codes; Task analysis; Training; Image retrieval; Feature extraction; Databases; Binary embedding; image retrieval;
DOI
10.1109/LSP.2022.3157517
CLC Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline Codes
0808; 0809
Abstract
Deep image hashing aims to map an input image to compact binary codes with a deep neural network, enabling efficient image retrieval over large-scale datasets. Due to the explosive growth of modern data, deep hashing has gained growing attention from the research community. Recently, convolutional neural networks such as ResNet have dominated deep hashing. Nevertheless, motivated by recent advances in vision transformers, we propose a pure transformer-based framework, called HashFormer, to tackle the deep hashing task. Specifically, we use a vision transformer (ViT) as our backbone and treat binary codes as the intermediate representations for our surrogate task, i.e., image classification. In addition, we observe that binary codes suitable for classification are sub-optimal for retrieval. To mitigate this problem, we present a novel average precision loss, which enables us to directly optimize retrieval accuracy. To the best of our knowledge, our work is among the pioneering efforts to address deep hashing without convolutional neural networks (CNNs). We perform comprehensive experiments on three widely studied datasets: CIFAR-10, NUS-WIDE and ImageNet. The proposed method demonstrates promising results against existing state-of-the-art works, validating the advantages and merits of HashFormer.
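The abstract describes the standard deep-hashing retrieval setting: continuous embeddings from a backbone are binarized, candidates are ranked by Hamming distance, and quality is measured by average precision (the metric HashFormer's novel loss targets). The paper's exact loss formulation is not given here; the following is a minimal NumPy sketch of the evaluation-side pipeline only, with all function names (`binarize`, `hamming_distance`, `average_precision`) being illustrative, not from the paper.

```python
import numpy as np

def binarize(features):
    """Map real-valued embeddings to {-1, +1} codes via sign,
    a common convention in deep hashing (illustrative, not the
    paper's exact quantization scheme)."""
    return np.where(features >= 0, 1, -1)

def hamming_distance(query_code, db_codes):
    """Hamming distance between one query code and a database of codes.
    For {-1, +1} codes of length L: d = (L - dot(q, x)) / 2."""
    L = query_code.shape[0]
    return (L - db_codes @ query_code) // 2

def average_precision(query_code, db_codes, relevant):
    """Rank the database by Hamming distance and compute average
    precision -- the retrieval metric the paper's loss optimizes."""
    order = np.argsort(hamming_distance(query_code, db_codes), kind="stable")
    rel = relevant[order]                                # relevance in ranked order
    hits = np.cumsum(rel)                                # relevant items seen so far
    precision_at_k = hits / (np.arange(len(rel)) + 1)    # precision at each rank
    return float((precision_at_k * rel).sum() / max(rel.sum(), 1))
```

Note that `np.argsort` (and hard ranking in general) is non-differentiable, which is why directly optimizing average precision during training, as the paper proposes, requires a surrogate loss rather than this literal computation.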
Pages: 827-831
Page count: 5
Related Papers
50 records in total
  • [1] Contrastive hashing with vision transformer for image retrieval
    Ren, Xiuxiu
    Zheng, Xiangwei
    Zhou, Huiyu
    Liu, Weilong
    Dong, Xiao
    [J]. INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2022, 37 (12) : 12192 - 12211
  • [2] Deep Supervised Hashing Image Retrieval Method Based on Swin Transformer
    Miao Z.
    Zhao X.
    Li Y.
    Wang J.
    Zhang R.
    [J]. Hunan Daxue Xuebao/Journal of Hunan University Natural Sciences, 2023, 50 (08): : 62 - 71
  • [3] Deep internally connected transformer hashing for image retrieval
    Chao, Zijian
    Cheng, Shuli
    Li, Yongming
    [J]. KNOWLEDGE-BASED SYSTEMS, 2023, 279
  • [4] Medical image retrieval based on deep hashing
    Yan, Longquan
    Shi, Wei
    [J]. DCC 2022: 2022 DATA COMPRESSION CONFERENCE (DCC), 2022, : 491 - 491
  • [5] VTHSC-MIR: Vision Transformer Hashing with Supervised Contrastive learning based medical image retrieval
    Kumar, Mehul
    Singh, Rhythumwinder
    Mukherjee, Prerana
    [J]. PATTERN RECOGNITION LETTERS, 2024, 184 : 28 - 36
  • [6] Quadruplet-based deep hashing for image retrieval
    Zhu, Jie
    Chen, Zhipeng
    Zhao, Li
    Wu, Shufang
    [J]. NEUROCOMPUTING, 2019, 366 : 161 - 169
  • [7] Deep Hamming Embedding Based Hashing for Image Retrieval
    Lin J.
    Liu H.
    Zheng Z.
    [J]. Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2020, 33 (06): : 542 - 550
  • [8] Deep Transfer Hashing for Image Retrieval
    Zhai, Hongjia
    Lai, Shenqi
    Jin, Hanyang
    Qian, Xueming
    Mei, Tao
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, 31 (02) : 742 - 753
  • [9] Deep Progressive Hashing for Image Retrieval
    Bai, Jiale
    Ni, Bingbing
    Wang, Minsi
    Li, Zefan
    Cheng, Shuo
    Yang, Xiaokang
    Hu, Chuanping
    Gao, Wen
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2019, 21 (12) : 3178 - 3193
  • [10] Hierarchical deep hashing for image retrieval
    Ge Song
    Xiaoyang Tan
    [J]. Frontiers of Computer Science, 2017, 11 : 253 - 265