Deep semantic hashing with dual attention for cross-modal retrieval

Cited by: 5
Authors
Wu, Jiagao [1 ,2 ]
Weng, Weiwei [1 ,2 ]
Fu, Junxia [1 ,2 ]
Liu, Linfeng [1 ,2 ]
Hu, Bin [3 ]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Sch Comp, Nanjing 210023, Jiangsu, Peoples R China
[2] Jiangsu Key Lab Big Data Secur & Intelligent Proc, Nanjing 210023, Jiangsu, Peoples R China
[3] Nanjing Normal Univ, Key Lab Virtual Geog Environm, Minist Educ, Nanjing 210046, Jiangsu, Peoples R China
Source
NEURAL COMPUTING & APPLICATIONS | 2022, Vol. 34, No. 7
Funding
National Natural Science Foundation of China
Keywords
Cross-modal retrieval; Deep hashing; Semantic label network; Attention mechanism; CODES;
DOI
10.1007/s00521-021-06696-y
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the explosive growth of multimodal data, cross-modal retrieval has drawn increasing research interest. Hashing-based methods have made great advances in cross-modal retrieval thanks to their low storage cost and fast query speed. However, improving retrieval accuracy remains a crucial challenge because of the heterogeneity gap between modalities. To tackle this problem, in this paper we propose a new two-stage cross-modal retrieval method, called Deep Semantic Hashing with Dual Attention (DSHDA). In the first stage of DSHDA, a Semantic Label Network (SeLabNet) is designed to extract label semantic features and hash codes by training on multi-label annotations, which places the learning of different modalities in a common semantic space and effectively bridges the modality gap. In the second stage of DSHDA, we propose a deep neural network that integrates feature and hash code learning for each modality into a single framework; training of the framework is guided by the label semantic features and hash codes generated by SeLabNet to maximize cross-modal semantic relevance. Moreover, dual attention mechanisms are used in our neural networks: (1) Lo-attention extracts the local key information of each modality and improves the quality of modality features; (2) Co-attention strengthens the relationship between different modalities to produce more consistent and accurate hash codes. Extensive experiments on two real-world datasets with image-text modalities demonstrate the superiority of the proposed method on cross-modal retrieval tasks.
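The two ideas the abstract leans on — co-attention between modalities and sign-based binarization into hash codes — can be illustrated with a minimal numpy sketch. This is not the authors' DSHDA network: the max-pooled affinity weighting, the softmax, and the random projection below are simplifying assumptions chosen only to show the mechanics of cross-modal attention and Hamming-space retrieval.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_regions, n_words, n_bits = 8, 5, 7, 16

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def co_attention(img, txt):
    # Affinity between every image region and every text word.
    affinity = img @ txt.T                       # (n_regions, n_words)
    # Weight each region/word by its strongest cross-modal match,
    # so the two attended vectors emphasize mutually relevant parts.
    img_weights = softmax(affinity.max(axis=1))  # (n_regions,)
    txt_weights = softmax(affinity.max(axis=0))  # (n_words,)
    return img_weights @ img, txt_weights @ txt  # two (d,) vectors

def to_hash_code(feature, proj):
    # Binarize a projected feature into a {-1, +1} code.
    return np.where(feature @ proj >= 0, 1, -1)

def hamming(a, b):
    # Retrieval in hash space ranks items by Hamming distance.
    return int((a != b).sum())

img_regions = rng.standard_normal((n_regions, d))  # toy region features
txt_words = rng.standard_normal((n_words, d))      # toy word features
proj = rng.standard_normal((d, n_bits))            # shared hash projection

v_img, v_txt = co_attention(img_regions, txt_words)
c_img = to_hash_code(v_img, proj)
c_txt = to_hash_code(v_txt, proj)
```

In the real method, the projection is learned jointly with the features and supervised by SeLabNet's label-derived codes; the point here is only that both modalities land in one binary space where comparison is a cheap bit count.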
Pages: 5397-5416
Number of pages: 20