Deep hashing image retrieval based on hybrid neural network and optimized metric learning

Cited by: 3
Authors
Xiao, Xingming [1]
Cao, Shu [2]
Wang, Liejun [1]
Cheng, Shuli [1]
Yuan, Erdong [1]
Affiliations
[1] Xinjiang Univ, Sch Comp Sci & Technol, Urumqi 830046, Peoples R China
[2] State Grid Xinjiang Elect Power Co, Informat & Commun Co, Urumqi 830063, Peoples R China
Funding
National Science Foundation (USA);
Keywords
Image retrieval; Deep hashing; Vision transformer; New strengthened external attention; New loss;
DOI
10.1016/j.knosys.2023.111336
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Transformers have improved image retrieval accuracy in computer vision, but challenges persist, including insufficient and imbalanced feature extraction and the inability to produce compact binary codes. This study introduces Vision Transformer with Deep Hashing (VTDH), an image retrieval approach that combines a hybrid neural network with optimized metric learning. The contributions are as follows. First, we introduce a New Strengthened External Attention (NEA) module that focuses on multi-scale features while assimilating global context, which enriches the model's comprehension of both overall structure and semantics. Second, we propose a new balanced loss function to address the imbalance between positive and negative samples within labels. The function takes sample labels as input and uses the mean value of all sample labels to quantify the frequency gap between positive and negative samples; combined with a customized balance weight, this addresses label imbalance. Third, we enhance the quantization loss function by intensifying its penalty when the model's binary code output exceeds ±1 in magnitude, which yields more robust and stable hash codes. The method is evaluated on CIFAR-10, NUS-WIDE, and ImageNet. Experimental results show higher retrieval accuracy than current state-of-the-art techniques; notably, VTDH achieves a mean average precision (mAP) of 97.3% on CIFAR-10.
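The abstract describes the two loss terms only in words, so the following is a minimal illustrative sketch and not the paper's formulation. It assumes an inverse-frequency form for the balance weights, a weighted binary cross-entropy for the balanced loss, and a squared penalty for the quantization term; the function names and the `overshoot_weight` parameter are hypothetical.

```python
import math


def balance_weights(labels):
    """Derive (w_pos, w_neg) from the mean of all 0/1 label entries.

    Assumption: inverse-frequency weighting. The abstract only says the
    label mean quantifies the positive/negative frequency gap.
    """
    n_entries = sum(len(row) for row in labels)
    mean = sum(sum(row) for row in labels) / n_entries  # share of positives
    mean = min(max(mean, 1e-6), 1.0 - 1e-6)             # avoid division by zero
    return 1.0 / mean, 1.0 / (1.0 - mean)


def balanced_bce(probs, labels):
    """Weighted binary cross-entropy using the weights above (assumed form)."""
    w_pos, w_neg = balance_weights(labels)
    total, count = 0.0, 0
    for p_row, y_row in zip(probs, labels):
        for p, y in zip(p_row, y_row):
            p = min(max(p, 1e-7), 1.0 - 1e-7)           # keep log() finite
            total -= w_pos * y * math.log(p) + w_neg * (1 - y) * math.log(1.0 - p)
            count += 1
    return total / count


def quantization_loss(codes, overshoot_weight=2.0):
    """Pull each code entry toward +/-1, with an extra penalty beyond +/-1.

    overshoot_weight is a hypothetical knob; the paper's actual penalty
    shape is not given in the abstract.
    """
    total, count = 0.0, 0
    for row in codes:
        for h in row:
            gap = abs(h) - 1.0
            total += gap ** 2                            # base pull toward +/-1
            if gap > 0.0:
                total += overshoot_weight * gap ** 2     # extra cost for |h| > 1
            count += 1
    return total / count
```

With one positive among four label entries, `balance_weights([[1, 0, 0, 0]])` weights the rare positive class more heavily than the negative class, and `quantization_loss` charges an output of 1.5 more than an output of 0.5 even though both are 0.5 away from the target magnitude.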
Pages: 11
Related papers
50 in total
  • [21] Medical image retrieval based on deep hashing
    Yan, Longquan
    Shi, Wei
    DCC 2022: 2022 DATA COMPRESSION CONFERENCE (DCC), 2022 : 491 - 491
  • [22] DEEP SELF-LEARNING HASHING FOR IMAGE RETRIEVAL
    Zhan, Jiawei
    Mo, Zhaoguo
    Zhu, Yuesheng
    2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020 : 1556 - 1560
  • [23] Liver Histopathological Image Retrieval Based on Deep Metric Learning
    Yang, Pengshuai
    Zhai, Yupeng
    Li, Lin
    Lv, Hairong
    Wang, Jigang
    Zhu, Chengzhan
    Jiang, Rui
    2019 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2019 : 914 - 919
  • [24] Deep triplet hashing network for case-based medical image retrieval
    Fang, Jiansheng
    Fu, Huazhu
    Liu, Jiang
    MEDICAL IMAGE ANALYSIS, 2021, 69
  • [25] A retrieval algorithm for encrypted speech based on convolutional neural network and deep hashing
    Zhang, Qiu-yu
    Li, Yu-zhou
    Hu, Ying-jie
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (01) : 1201 - 1221
  • [26] A retrieval algorithm for encrypted speech based on convolutional neural network and deep hashing
    Qiu-yu Zhang
    Yu-zhou Li
    Ying-jie Hu
    Multimedia Tools and Applications, 2021, 80 : 1201 - 1221
  • [27] Image Retrieval Based on Convolutional Neural Network and Kernel-Based Supervised Hashing
    Peng, Tianqiang
    Zhao, Yongwei
    Ke, Shengcai
    2015 8TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING (CISP), 2015 : 544 - 549
  • [28] A Hashing Image Retrieval Method Based on Deep Learning and Local Feature Fusion
    Nie, Yi-Liang
    Du, Ji-Xiang
    Fan, Wen-Tao
    INTELLIGENT COMPUTING THEORIES AND APPLICATION, ICIC 2017, PT I, 2017, 10361 : 200 - 210
  • [29] Deep Learning-Based Image Retrieval With Unsupervised Double Bit Hashing
    Guo, Jing-Ming
    Prayuda, Alim Wicaksono Hari
    Prasetyo, Heri
    Seshathiri, Sankarasrinivasan
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (11) : 7050 - 7065
  • [30] Optimized Deep-Neural Network for Content-based Medical Image Retrieval in a Brownfield IoMT Network
    Tiwari, Arti
    Pant, Millie
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2022, 18 (02)