A triple fusion model for cross-modal deep hashing retrieval

Cited by: 1

Authors:

Wang, Hufei [1]

Zhao, Kaiqiang [1]

Zhao, Dexin [1]

Affiliations:

[1] Tianjin Univ Technol, Tianjin Key Lab Intelligence Comp & Novel Softwar, 391 West Binshui Rd, Tianjin 300384, Peoples R China

Source:

MULTIMEDIA SYSTEMS | 2023 / Vol. 29 / Issue 01

Keywords:

Hashing learning; Cross-modal retrieval; Semantic similarity; Shared semantics;

DOI:

10.1007/s00530-022-01005-6

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812 ;

Abstract:

In the field of resource retrieval, deep cross-modal retrieval is attracting increasing attention. It offers lower storage cost and faster retrieval speed. However, most current methods focus on the semantic similarity between hash codes and ignore the similarity between the features that the model extracts from different modalities, which leads to suboptimal results. In addition, the correlation between different modalities is difficult to exploit adequately. Therefore, to enhance the information correlation between different modalities, this paper proposes a triple fusion model for cross-modal deep hashing retrieval (SSTFH). To reduce the loss of feature information when features pass through the fully connected layer, we design a triple fusion strategy. Specifically, the first and second fusions are performed on images and text, respectively, to obtain pattern-specific features, and the third fusion is used to obtain more relevant semantic features. In addition, we attempt to use shared semantic information from the semantic features to guide the model in extracting correlations between different modalities. Comprehensive experiments were conducted on the benchmark IAPR TC-12 and MS COCO datasets. On MS COCO, our approach outperforms all the deep baselines by an average of 7.74% on the image-to-text task and by 8.72% on the text-to-image task. On IAPR TC-12, our approach improves image retrieval by 7.07% and text retrieval by 4.88% on average.


Pages: 347-359

Page count: 13
