A triple fusion model for cross-modal deep hashing retrieval

Cited by: 1
Authors
Wang, Hufei [1 ]
Zhao, Kaiqiang [1 ]
Zhao, Dexin [1 ]
Affiliations
[1] Tianjin Univ Technol, Tianjin Key Lab Intelligence Comp & Novel Softwar, 391 West Binshui Rd, Tianjin 300384, Peoples R China
Keywords
Hashing learning; Cross-modal retrieval; Semantic similarity; Shared semantics;
DOI
10.1007/s00530-022-01005-6
Chinese Library Classification (CLC) number
TP [Automation Technology; Computer Technology]
Discipline code
0812
Abstract
In the field of resource retrieval, deep cross-modal retrieval is attracting increasing attention because hash-based methods require less storage and retrieve results faster. However, most current methods focus only on the semantic similarity between hash codes and ignore the similarity between the features the model extracts from different modalities, which leads to suboptimal results. In addition, the correlation between different modalities is difficult to exploit adequately. Therefore, to strengthen the information correlation between modalities, this paper proposes a triple fusion model for cross-modal deep hashing retrieval (SSTFH). To mitigate the loss of feature information as features pass through the fully connected layers, we design a triple fusion strategy: the first and second fusions are performed on images and text respectively to obtain modality-specific features, and the third fusion produces more relevant semantic features. In addition, we use the shared semantic information in these semantic features to guide the model in extracting correlations between modalities. Comprehensive experiments were conducted on the benchmark IAPR TC-12 and MS COCO datasets. On MS COCO, our approach outperforms all deep baselines by an average of 7.74% on the image-to-text task and 8.72% on the text-to-image task. On IAPR TC-12, it improves image retrieval by an average of 7.07% and text retrieval by 4.88%.
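The triple fusion strategy summarized above can be sketched in miniature as follows. This is a minimal illustration, not the paper's actual architecture: the fusion operator (plain concatenation), the feature dimensions, and the random hash projection `W` are all assumptions introduced here for clarity.

```python
import random

def fuse(a, b):
    """Concatenate two feature vectors -- a simple stand-in for the
    paper's fusion operation, whose exact form is not given in the abstract."""
    return a + b

def sign_hash(features, weights):
    """Project fused features with a (hypothetical, untrained) weight
    matrix and binarize with sign() to obtain a hash code."""
    code = []
    for row in weights:
        s = sum(w * f for w, f in zip(row, features))
        code.append(1 if s >= 0 else -1)
    return code

random.seed(0)

# Hypothetical intermediate and final features for each modality,
# e.g. taken before and after a fully connected layer.
img_mid = [random.gauss(0, 1) for _ in range(4)]
img_out = [random.gauss(0, 1) for _ in range(4)]
txt_mid = [random.gauss(0, 1) for _ in range(4)]
txt_out = [random.gauss(0, 1) for _ in range(4)]

# Fusions 1 and 2: modality-specific fusions for image and text.
img_fused = fuse(img_mid, img_out)   # length 8
txt_fused = fuse(txt_mid, txt_out)   # length 8

# Fusion 3: cross-modal fusion into a shared semantic representation.
shared = fuse(img_fused, txt_fused)  # length 16

# Hash layer: a 6-bit code from a random projection (illustrative only).
W = [[random.gauss(0, 1) for _ in range(16)] for _ in range(6)]
print(sign_hash(shared, W))
```

In the actual model the fusion and projection weights would be learned end-to-end under similarity-preserving losses; the sketch only shows how the three fusion stages compose.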
Pages: 347-359
Page count: 13
Related papers
50 records total
  • [31] Asymmetric Supervised Fusion-Oriented Hashing for Cross-Modal Retrieval
    Yang, Zhan
    Deng, Xiyin
    Guo, Lin
    Long, Jun
    [J]. IEEE TRANSACTIONS ON CYBERNETICS, 2024, 54 (02) : 851 - 864
  • [32] Boosting deep cross-modal retrieval hashing with adversarially robust training
    Zhang, Xingwei
    Zheng, Xiaolong
    Mao, Wenji
    Zeng, Daniel Dajun
    [J]. APPLIED INTELLIGENCE, 2023, 53 (20) : 23698 - 23710
  • [34] Cycle-Consistent Deep Generative Hashing for Cross-Modal Retrieval
    Wu, Lin
    Wang, Yang
    Shao, Ling
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (04) : 1602 - 1612
  • [35] Deep Multi-Level Semantic Hashing for Cross-Modal Retrieval
    Ji, Zhenyan
    Yao, Weina
    Wei, Wei
    Song, Houbing
    Pi, Huaiyu
    [J]. IEEE ACCESS, 2019, 7 : 23667 - 23674
  • [36] Attention-Aware Deep Adversarial Hashing for Cross-Modal Retrieval
    Zhang, Xi
    Lai, Hanjiang
    Feng, Jiashi
    [J]. COMPUTER VISION - ECCV 2018, PT 15, 2018, 11219 : 614 - 629
  • [37] Triplet-Based Deep Hashing Network for Cross-Modal Retrieval
    Deng, Cheng
    Chen, Zhaojia
    Liu, Xianglong
    Gao, Xinbo
    Tao, Dacheng
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2018, 27 (08) : 3893 - 3903
  • [38] Label-Based Deep Semantic Hashing for Cross-Modal Retrieval
    Weng, Weiwei
    Wu, Jiagao
    Yang, Lu
    Liu, Linfeng
    Hu, Bin
    [J]. NEURAL INFORMATION PROCESSING (ICONIP 2019), PT III, 2019, 11955 : 24 - 36
  • [39] Deep Adversarial Cascaded Hashing for Cross-Modal Vessel Image Retrieval
    Guo, Jiaen
    Guan, Xin
    [J]. IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2023, 16 : 2205 - 2220
  • [40] Targeted Adversarial Attack Against Deep Cross-Modal Hashing Retrieval
    Wang, Tianshi
    Zhu, Lei
    Zhang, Zheng
    Zhang, Huaxiang
    Han, Junwei
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (10) : 6159 - 6172