Deep semantic hashing with dual attention for cross-modal retrieval

Cited by: 5
Authors
Wu, Jiagao [1 ,2 ]
Weng, Weiwei [1 ,2 ]
Fu, Junxia [1 ,2 ]
Liu, Linfeng [1 ,2 ]
Hu, Bin [3 ]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Sch Comp, Nanjing 210023, Jiangsu, Peoples R China
[2] Jiangsu Key Lab Big Data Secur & Intelligent Proc, Nanjing 210023, Jiangsu, Peoples R China
[3] Nanjing Normal Univ, Key Lab Virtual Geog Environm, Minist Educ, Nanjing 210046, Jiangsu, Peoples R China
Source
NEURAL COMPUTING & APPLICATIONS | 2022, Vol. 34, No. 7
Funding
National Natural Science Foundation of China;
Keywords
Cross-modal retrieval; Deep hashing; Semantic label network; Attention mechanism; CODES;
DOI
10.1007/s00521-021-06696-y
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the explosive growth of multimodal data, cross-modal retrieval has drawn increasing research interest. Hashing-based methods have made great advances in cross-modal retrieval owing to their low storage cost and fast query speed. However, improving the accuracy of cross-modal retrieval remains a crucial challenge because of the heterogeneity gap between modalities. To tackle this problem, in this paper we propose a new two-stage cross-modal retrieval method called Deep Semantic Hashing with Dual Attention (DSHDA). In the first stage of DSHDA, a Semantic Label Network (SeLabNet) is designed to extract label semantic features and hash codes by training on the multi-label annotations, which places the learning of different modalities in a common semantic space and effectively bridges the modality gap. In the second stage, we propose a deep neural network that integrates feature and hash code learning for each modality into the same framework; its training is guided by the label semantic features and hash codes generated by SeLabNet to maximize cross-modal semantic relevance. Moreover, dual attention mechanisms are used in our neural networks: (1) Lo-attention extracts the local key information of each modality and improves the quality of modality features; (2) Co-attention strengthens the relationship between different modalities to produce more consistent and accurate hash codes. Extensive experiments on two real datasets with image-text modalities demonstrate the superiority of the proposed method in cross-modal retrieval tasks.
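The abstract describes the architecture only at a high level. The sketch below is a minimal PyTorch illustration of how the two stages and the dual attention modules could fit together; the code length, feature dimensions, the soft-attention form of Lo-attention, the gating form of Co-attention, and the MSE guidance loss are all assumptions made for illustration, not the paper's actual design.

```python
# Minimal sketch of the two-stage DSHDA pipeline as described in the abstract.
# All sizes and the exact attention/loss formulations are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

HASH_BITS, LABEL_DIM, FEAT_DIM = 64, 24, 512  # assumed sizes


class SeLabNet(nn.Module):
    """Stage 1: map multi-label annotations to label semantic features and
    relaxed hash codes that later supervise both modality branches."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(LABEL_DIM, FEAT_DIM), nn.ReLU())
        self.hash_head = nn.Linear(FEAT_DIM, HASH_BITS)

    def forward(self, labels):
        sem = self.encoder(labels)                   # label semantic features
        return sem, torch.tanh(self.hash_head(sem))  # codes relaxed to (-1, 1)


class LoAttention(nn.Module):
    """Lo-attention stand-in: soft attention over per-region (image) or
    per-word (text) features to pick out local key information."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, x):                       # x: (batch, parts, dim)
        w = F.softmax(self.score(x), dim=1)     # weight per local part
        return (w * x).sum(dim=1)               # attended modality feature


class CoAttention(nn.Module):
    """Co-attention stand-in: each modality's feature is gated by the other,
    pushing the two branches toward consistent hash codes."""
    def __init__(self, dim):
        super().__init__()
        self.gate_img = nn.Linear(dim, dim)
        self.gate_txt = nn.Linear(dim, dim)

    def forward(self, img, txt):                # both: (batch, dim)
        return (img * torch.sigmoid(self.gate_img(txt)),
                txt * torch.sigmoid(self.gate_txt(img)))


class DSHDAStage2(nn.Module):
    """Stage 2: joint feature and hash code learning for both modalities."""
    def __init__(self):
        super().__init__()
        self.img_attn = LoAttention(FEAT_DIM)
        self.txt_attn = LoAttention(FEAT_DIM)
        self.co_attn = CoAttention(FEAT_DIM)
        self.img_hash = nn.Linear(FEAT_DIM, HASH_BITS)
        self.txt_hash = nn.Linear(FEAT_DIM, HASH_BITS)

    def forward(self, img_regions, txt_words):
        img, txt = self.img_attn(img_regions), self.txt_attn(txt_words)
        img, txt = self.co_attn(img, txt)
        return torch.tanh(self.img_hash(img)), torch.tanh(self.txt_hash(txt))


# Toy guidance step: both modality codes are pulled toward the label codes
# from SeLabNet (an assumed stand-in for the paper's semantic-guidance loss).
labels = torch.rand(8, LABEL_DIM).round()       # fake multi-label batch
img_regions = torch.randn(8, 36, FEAT_DIM)      # e.g. 36 image regions
txt_words = torch.randn(8, 20, FEAT_DIM)        # e.g. 20 word features
_, label_codes = SeLabNet()(labels)
img_codes, txt_codes = DSHDAStage2()(img_regions, txt_words)
loss = F.mse_loss(img_codes, label_codes) + F.mse_loss(txt_codes, label_codes)
loss.backward()
```

Training SeLabNet first and then using its outputs as fixed targets for the modality networks mirrors the two-stage description in the abstract; the paper's actual losses and attention formulations would follow its own definitions.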
Pages: 5397-5416
Number of pages: 20
Related Papers
50 records in total
  • [21] Joint Specifics and Dual-Semantic Hashing Learning for Cross-Modal Retrieval
    Teng, Shaohua
    Lin, Shengjie
    Teng, Luyao
    Wu, Naiqi
    Zheng, Zefeng
    Fei, Lunke
    Zhang, Wei
[J]. NEUROCOMPUTING, 2024, 565
  • [22] Dual-supervised attention network for deep cross-modal hashing
    Peng, Hanyu
    He, Junjun
    Chen, Shifeng
    Wang, Yali
    Qiao, Yu
    [J]. PATTERN RECOGNITION LETTERS, 2019, 128 : 333 - 339
  • [24] Multilevel Deep Semantic Feature Asymmetric Network for Cross-Modal Hashing Retrieval
    Jiang, Xiaolong
    Fan, Jiabao
    Zhang, Jie
    Lin, Ziyong
    Li, Mingyong
    [J]. IEEE LATIN AMERICA TRANSACTIONS, 2024, 22 (08) : 621 - 631
  • [25] Deep Semantic Correlation Learning based Hashing for Multimedia Cross-Modal Retrieval
    Gong, Xiaolong
    Huang, Linpeng
    Wang, Fuwei
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2018, : 117 - 126
  • [26] Deep noise mitigation and semantic reconstruction hashing for unsupervised cross-modal retrieval
    Zhang, Cheng
    Wan, Yuan
    Qiang, Haopeng
    [J]. NEURAL COMPUTING & APPLICATIONS, 2024, 36 (10): : 5383 - 5397
  • [28] Deep Semantic-Preserving Reconstruction Hashing for Unsupervised Cross-Modal Retrieval
    Cheng, Shuli
    Wang, Liejun
    Du, Anyu
    [J]. ENTROPY, 2020, 22 (11) : 1 - 22
  • [29] Deep Hashing Similarity Learning for Cross-Modal Retrieval
    Ma, Ying
    Wang, Meng
    Lu, Guangyun
    Sun, Yajun
    [J]. IEEE ACCESS, 2024, 12 : 8609 - 8618
  • [30] Supervised Hierarchical Deep Hashing for Cross-Modal Retrieval
    Zhan, Yu-Wei
    Luo, Xin
    Wang, Yongxin
    Xu, Xin-Shun
    [J]. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 3386 - 3394