Deep Multi-Scale Attention Hashing Network for Large-Scale Image Retrieval

Authors
Feng H. [1, 2]
Wang N. [2]
Tang J. [2]
Affiliations
[1] School of Management Science and Engineering, Anhui University of Finance and Economics, Bengbu
[2] School of Electronics and Information Engineering, Anhui University, Hefei
Funding
National Natural Science Foundation of China
Keywords
Attention; Deep learning; Hashing; Image retrieval; Quantization;
DOI
10.12141/j.issn.1000-565X.210268
Abstract
To address the limited feature extraction capability and inefficient quantization constraints of existing hashing methods, a deep multi-scale attention hashing network is proposed for large-scale image retrieval. The network consists of a main branch and an object branch. In the main branch, a multi-scale attention localization module and a salient region extraction module localize and extract the salient regions of an image, and these regions are fed into the object branch to learn finer-grained features. The multi-granularity features learned by the two branches are then fused to generate binary hash codes. In addition, a triplet quantization constraint is introduced to reduce quantization error while preserving the similarity relationships between sample pairs. To verify the effectiveness of the proposed method, extensive experiments were carried out on two benchmark datasets. The experimental results show that the proposed method outperforms most existing hashing retrieval methods. © 2022, Editorial Department, Journal of South China University of Technology. All rights reserved.
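The abstract does not state the form of the triplet quantization constraint. As a rough illustration only, the sketch below assumes a standard triplet hinge loss on squared Euclidean distances plus a penalty on the gap between real-valued network outputs and their binary codes. The function names, the `margin` and `lam` parameters, and the exact penalty are hypothetical and are not taken from the paper.

```python
import numpy as np


def binarize(outputs):
    """Map real-valued outputs to binary codes in {-1, +1} (zero maps to +1)."""
    return np.where(outputs >= 0, 1.0, -1.0)


def triplet_quantization_loss(anchor, positive, negative, margin=1.0, lam=0.1):
    """Hypothetical loss: triplet hinge term plus a quantization penalty.

    Each argument is an array of shape (batch, code_length) holding
    real-valued outputs before binarization. `margin` and `lam` are
    illustrative hyperparameters, not values from the paper.
    """
    # Squared Euclidean distances from the anchor to positive and negative.
    d_pos = np.sum((anchor - positive) ** 2, axis=1)
    d_neg = np.sum((anchor - negative) ** 2, axis=1)

    # Hinge term: similar pairs should be closer than dissimilar pairs
    # by at least `margin`.
    triplet = np.maximum(0.0, margin + d_pos - d_neg)

    # Quantization term: squared gap between outputs and their binary codes,
    # averaged over all samples in the triplet batch.
    outputs = np.concatenate([anchor, positive, negative], axis=0)
    quant = np.mean(np.sum((outputs - binarize(outputs)) ** 2, axis=1))

    return float(np.mean(triplet) + lam * quant)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a, p, n = (rng.normal(size=(4, 16)) for _ in range(3))
    print(triplet_quantization_loss(a, p, n))
```

In this sketch the quantization term falls to zero only when every output is already exactly -1 or +1, so minimizing it pushes the continuous outputs toward valid hash codes while the hinge term keeps the ordering of similar and dissimilar samples.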
Pages: 35-45
Number of pages: 10