Deep hashing image retrieval based on hybrid neural network and optimized metric learning

Cited by: 3
Authors
Xiao, Xingming [1 ]
Cao, Shu [2 ]
Wang, Liejun [1 ]
Cheng, Shuli [1 ]
Yuan, Erdong [1 ]
Affiliations
[1] Xinjiang Univ, Sch Comp Sci & Technol, Urumqi 830046, Peoples R China
[2] State Grid Xinjiang Elect Power Co, Informat & Commun Co, Urumqi 830063, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Image retrieval; Deep hashing; Vision transformer; New strengthened external attention; New loss;
DOI
10.1016/j.knosys.2023.111336
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
While transformers have improved image retrieval accuracy in computer vision, challenges persist, including insufficient and imbalanced feature extraction and the inability to produce compact binary codes. This study introduces a novel approach for image retrieval called Vision Transformer with Deep Hashing (VTDH), combining a hybrid neural network and optimized metric learning. Our contributions are summarized as follows. We introduce an innovative New strengthened External Attention (NEA) module capable of simultaneously focusing on multi-scale features and assimilating comprehensive global context, enriching the model's comprehension of both overall structure and semantics. We also propose a new balanced loss function to tackle the imbalance between positive and negative samples within labels. Notably, this function takes sample labels as input and uses the mean value of all sample labels to quantify the frequency gap between positive and negative samples; combined with a customized balance weight, it effectively addresses label imbalance. Concurrently, we enhance the quantization loss function, intensifying its penalty for instances where the model's binary code output exceeds +/-1, yielding a more robust and stable hash code output. The proposed method is assessed on prominent datasets, including CIFAR-10, NUS-WIDE, and ImageNet. Experimental results show superior retrieval accuracy compared to current state-of-the-art techniques; notably, the VTDH model achieves a mean average precision (mAP) of 97.3% on the CIFAR-10 dataset.
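The abstract's exact loss formulations are not given in this record, but the two ideas it describes (label-frequency-based balance weights and an overshoot penalty on relaxed hash codes) can be sketched roughly as follows. This is a minimal illustration under assumed functional forms, not the paper's actual equations; the helper names and the `overshoot_weight` parameter are hypothetical.

```python
import numpy as np

def balance_weights(labels):
    """Hypothetical balance weights derived from the mean of the sample
    labels (the record only states that this mean is used to measure the
    positive/negative frequency gap). `labels` is an (N, C) 0/1 matrix."""
    pos_freq = labels.mean(axis=0)  # per-class fraction of positive samples
    # Up-weight the rarer side of each class so positives and negatives
    # contribute comparably to the loss.
    w_pos = 1.0 - pos_freq
    w_neg = pos_freq
    return w_pos, w_neg

def quantization_penalty(codes, overshoot_weight=2.0):
    """Assumed quantization loss: the usual (|b| - 1)^2 pull toward the
    {-1, +1} vertices, with a heavier multiplier wherever the relaxed
    code exceeds +/-1, as the abstract describes."""
    dev = np.abs(codes) - 1.0
    base = dev ** 2
    scale = np.where(dev > 0, overshoot_weight, 1.0)  # extra penalty past +/-1
    return np.mean(scale * base)
```

The asymmetric multiplier is one plausible reading of "intensifying its penalty for instances where the model's binary code output surpasses +/-1"; the published loss may differ in form.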
Pages: 11