Image retrieval;
Deep hashing;
Vision transformer;
New strengthened external attention;
New loss;
DOI:
10.1016/j.knosys.2023.111336
Chinese Library Classification number:
TP18 [Artificial intelligence theory];
Discipline classification number:
081104 ;
0812 ;
0835 ;
1405 ;
Abstract:
Although transformers have improved image retrieval accuracy in computer vision, challenges persist, including insufficient and imbalanced feature extraction and the inability to generate compact binary codes. This study introduces Vision Transformer with Deep Hashing (VTDH), an image retrieval approach that combines a hybrid neural network with optimized metric learning. The contributions are as follows. First, we introduce a New Strengthened External Attention (NEA) module that attends to multi-scale features while assimilating global context, which enriches the model's understanding of both overall structure and semantics. Second, we propose a new balanced loss function to address the imbalance between positive and negative samples within labels. This function takes the sample labels as input and uses the mean value of all sample labels to quantify the frequency gap between positive and negative samples; combined with a customized balance weight, it addresses label imbalance. Third, we enhance the quantization loss function by strengthening its penalty when the model's binary code output exceeds ±1, which yields more robust and stable hash codes. The method is evaluated on the CIFAR-10, NUS-WIDE, and ImageNet datasets. Experimental results show higher retrieval accuracy than current state-of-the-art techniques; notably, VTDH achieves a mean average precision (mAP) of 97.3% on CIFAR-10.
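The abstract does not give formulas for the two loss terms, so the following is only a rough sketch of the ideas it describes, not the VTDH implementation. It assumes multi-hot label rows, uses the mean of all label entries as the positive rate, and adds an extra penalty when an output falls outside [-1, 1]. The function names, the weighting rule, and the `overshoot_weight` parameter are all hypothetical.

```python
def positive_rate(labels):
    """Mean of all label entries, i.e. the fraction of positive (1) entries."""
    total = sum(len(row) for row in labels)
    return sum(sum(row) for row in labels) / total


def balance_weights(labels):
    """Assumed weighting rule: the rarer class receives the larger weight."""
    p = positive_rate(labels)
    return {"positive": 1.0 - p, "negative": p}


def quantization_loss(outputs, overshoot_weight=2.0):
    """Mean squared distance of |h| from 1, plus an assumed extra term
    that applies only when an output lies outside [-1, 1]."""
    total, count = 0.0, 0
    for code in outputs:
        for h in code:
            gap = abs(h) - 1.0
            total += gap * gap
            if gap > 0.0:  # output magnitude exceeds 1
                total += overshoot_weight * gap * gap
            count += 1
    return total / count


def binarize(code):
    """Sign function mapping real-valued outputs to a {-1, +1} hash code."""
    return [1 if h >= 0 else -1 for h in code]


# Toy data: three samples, four classes, few positives per row.
labels = [[1, 0, 0, 0], [0, 0, 1, 0], [1, 0, 0, 0]]
weights = balance_weights(labels)
inside = quantization_loss([[0.5, -0.5]])
outside = quantization_loss([[1.5, -1.5]])
```

With these toy values, outputs at ±1.5 are penalized more heavily than outputs at ±0.5 even though both lie at distance 0.5 from ±1, which is the asymmetry the abstract attributes to the enhanced quantization loss.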
Affiliation:
Xi An Jiao Tong Univ, Qian Xuesen Program, Xian 710049, Peoples R China
Zhai, Hongjia
Lai, Shenqi
Affiliation:
Xi An Jiao Tong Univ, Fac Elect & Informat Engn, Xian 710049, Peoples R China
Meituan, Dianping, Peoples R China
Xi An Jiao Tong Univ, Qian Xuesen Program, Xian 710049, Peoples R China
Lai, Shenqi
Jin, Hanyang
Affiliation:
Xi An Jiao Tong Univ, Fac Elect & Informat Engn, Xian 710049, Peoples R China
Xi An Jiao Tong Univ, Qian Xuesen Program, Xian 710049, Peoples R China
Jin, Hanyang
Qian, Xueming
Affiliation:
Xi An Jiao Tong Univ, Sch Informat & Commun Engn, Key Lab Intelligent Networks & Network Secur, Minist Educ, Xian 710049, Peoples R China
Xi An Jiao Tong Univ, SMILES LAB, Xian 710049, Peoples R China
Xi An Jiao Tong Univ, Qian Xuesen Program, Xian 710049, Peoples R China
Qian, Xueming
Mei, Tao
Affiliation:
AI Res, JD Com, Beijing 101149, Peoples R China
Xi An Jiao Tong Univ, Qian Xuesen Program, Xian 710049, Peoples R China