Deep cross-modal retrieval is attracting increasing attention in the field of resource retrieval, owing to its low storage cost and fast retrieval speed. However, most current methods focus on the semantic similarity between hash codes while ignoring the similarity between the features the model extracts from different modalities, which leads to suboptimal results. In addition, the correlation between different modalities is difficult to exploit adequately. Therefore, to strengthen the correlation of information across modalities, this paper proposes a triple fusion model for cross-modal deep hashing retrieval (SSTFH). To mitigate the loss of feature information that occurs when features pass through fully connected layers, we design a triple fusion strategy. Specifically, the first and second fusions are performed on images and text, respectively, to obtain modality-specific features, while the third fusion is used to obtain more relevant semantic features. In addition, we attempt to use the shared semantic information in these semantic features to guide the model in extracting correlations between modalities. Comprehensive experiments were conducted on the benchmark IAPR TC-12 and MS COCO datasets. On MS COCO, our approach outperforms all deep baselines by an average of 7.74% on the image-to-text task and by 8.72% on the text-to-image task. On IAPR TC-12, it improves image retrieval by an average of 7.07% and text retrieval by 4.88%.