Deep semantic hashing with dual attention for cross-modal retrieval

Cited by: 5
Authors
Wu, Jiagao [1 ,2 ]
Weng, Weiwei [1 ,2 ]
Fu, Junxia [1 ,2 ]
Liu, Linfeng [1 ,2 ]
Hu, Bin [3 ]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Sch Comp, Nanjing 210023, Jiangsu, Peoples R China
[2] Jiangsu Key Lab Big Data Secur & Intelligent Proc, Nanjing 210023, Jiangsu, Peoples R China
[3] Nanjing Normal Univ, Key Lab Virtual Geog Environm, Minist Educ, Nanjing 210046, Jiangsu, Peoples R China
Source
NEURAL COMPUTING & APPLICATIONS | 2022, Vol. 34, Issue 7
Funding
National Natural Science Foundation of China;
Keywords
Cross-modal retrieval; Deep hashing; Semantic label network; Attention mechanism; CODES;
DOI
10.1007/s00521-021-06696-y
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
With the explosive growth of multimodal data, cross-modal retrieval has drawn increasing research interest. Hashing-based methods have made great advances in cross-modal retrieval owing to their low storage cost and fast query speed. However, improving the accuracy of cross-modal retrieval remains a crucial challenge because of the heterogeneity gap between modalities. To tackle this problem, in this paper we propose a new two-stage cross-modal retrieval method, called Deep Semantic Hashing with Dual Attention (DSHDA). In the first stage of DSHDA, a Semantic Label Network (SeLabNet) is designed to extract label semantic features and hash codes by training on the multi-label annotations, which places the learning of different modalities in a common semantic space and bridges the modality gap effectively. In the second stage of DSHDA, we propose a deep neural network that integrates feature and hash-code learning for each modality into a single framework; the training of this framework is guided by the label semantic features and hash codes generated by SeLabNet to maximize cross-modal semantic relevance. Moreover, dual attention mechanisms are used in our neural networks: (1) Lo-attention extracts the local key information of each modality and improves the quality of modality features; (2) Co-attention strengthens the relationship between different modalities to produce more consistent and accurate hash codes. Extensive experiments on two real datasets with image-text modalities demonstrate the superiority of the proposed method on cross-modal retrieval tasks.
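The co-attention idea described above — letting each modality weight its local features by how strongly they match the other modality — can be illustrated with a minimal sketch. This is not the paper's implementation; the function name, the use of a region-word affinity matrix, and the max-pooled weighting are illustrative assumptions, with NumPy standing in for a deep-learning framework.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def co_attention(img_feats, txt_feats):
    """Illustrative cross-modal co-attention (not the paper's exact design).

    img_feats: (n_regions, d) local image-region features
    txt_feats: (n_words, d) word-level text features
    Returns a (d,) attended summary vector for each modality.
    """
    # Affinity between every image region and every word.
    affinity = img_feats @ txt_feats.T            # (n_regions, n_words)
    # Weight each region by its strongest match to any word, and vice versa,
    # so each modality's summary is guided by the other modality.
    img_weights = softmax(affinity.max(axis=1))   # (n_regions,)
    txt_weights = softmax(affinity.max(axis=0))   # (n_words,)
    img_attended = img_weights @ img_feats        # (d,)
    txt_attended = txt_weights @ txt_feats        # (d,)
    return img_attended, txt_attended

rng = np.random.default_rng(0)
img = rng.standard_normal((6, 8))   # 6 regions, 8-dim features
txt = rng.standard_normal((4, 8))   # 4 words, 8-dim features
a, b = co_attention(img, txt)
print(a.shape, b.shape)             # (8,) (8,)
```

In a full model the attended summaries would be fed to per-modality hashing heads (e.g. a tanh layer thresholded to ±1) so that matched image-text pairs map to similar binary codes.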
Pages: 5397-5416
Page count: 20
Related Papers
50 records in total
  • [41] Li, Guoyou; Peng, Qingjun; Zou, Dexu; Yang, Jinyue; Shu, Zhenqiu. Fine-grained similarity semantic preserving deep hashing for cross-modal retrieval. FRONTIERS IN PHYSICS, 2023, 11
  • [42] Zhang, Donglin; Wu, Xiao-Jun; Chen, Guoqing. ONION: Online Semantic Autoencoder Hashing for Cross-Modal Retrieval. ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2023, 14 (02)
  • [43] Li, Weian; Xiong, Haixia; Ou, Weihua; Gou, Jianping; Deng, Jiaxing; Liang, Linqing; Zhou, Quan. Semantic Constraints Matrix Factorization Hashing for cross-modal retrieval. COMPUTERS & ELECTRICAL ENGINEERING, 2022, 100
  • [44] Wang, Di; Zhang, Caiping; Wang, Quan; Tian, Yumin; He, Lihuo; Zhao, Lin. Hierarchical Semantic Structure Preserving Hashing for Cross-Modal Retrieval. IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25: 1217-1229
  • [45] Qin, Jianyang; Fei, Lunke; Teng, Shaohua; Zhang, Wei; Liu, Dongning; Zhao, Genping; Yuan, Haoliang. Discrete Semantic Matrix Factorization Hashing for Cross-Modal Retrieval. 2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021: 1550-1557
  • [46] Song, Ge; Tan, Xiaoyang; Zhao, Jun; Yang, Ming. Deep robust multilevel semantic hashing for multi-label cross-modal retrieval. PATTERN RECOGNITION, 2021, 120
  • [47] Chen, Shubai; Wu, Song; Wang, Li. Hierarchical semantic interaction-based deep hashing network for cross-modal retrieval. PEERJ COMPUTER SCIENCE, 2021
  • [48] Liu, Xiaoqing; Zeng, Huanqiang; Shi, Yifan; Zhu, Jianqing; Ma, Kai-Kuang. Deep Rank Cross-Modal Hashing with Semantic Consistent for Image-Text Retrieval. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022: 4828-4832
  • [49] Wang, Xiaoxiao; Liang, Meiyu; Cao, Xiaowen; Du, Junping. Dual-pathway Attention based Supervised Adversarial Hashing for Cross-modal Retrieval. 2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP 2021), 2021: 168-171
  • [50] Wang, Cheng; Yang, Haojin; Meinel, Christoph. Deep Semantic Mapping for Cross-Modal Retrieval. 2015 IEEE 27TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2015), 2015: 234-241