Semantic-Driven Interpretable Deep Multi-Modal Hashing for Large-Scale Multimedia Retrieval

Cited by: 25

Authors:

Lu, Xu [1]

Liu, Li [1]

Nie, Liqiang [2]

Chang, Xiaojun [3]

Zhang, Huaxiang [1]

Affiliations:

[1] Shandong Normal Univ, Sch Informat Sci & Engn, Jinan 250358, Peoples R China

[2] Shandong Univ, Sch Comp Sci & Technol, Qingdao 266237, Peoples R China

[3] Monash Univ, Fac Informat Technol, Clayton, Vic 3800, Australia

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2021 / Vol. 23

Funding:

National Natural Science Foundation of China;

Keywords:

Semantics; Task analysis; Data models; Feature extraction; Redundancy; Fuses; Optimization; Multi-modal hashing; large-scale multimedia; retrieval; interpretable hashing;

DOI:

10.1109/TMM.2020.3044473

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Multi-modal hashing focuses on fusing different modalities and exploring the complementarity of heterogeneous multi-modal data for compact hash learning. However, existing multi-modal hashing methods still suffer from several problems: 1) Almost all existing methods generate unexplainable hash codes. They roughly assume that every hash code bit contributes equally to the retrieval results, ignoring the discriminative information embedded in hash learning and the semantic similarity in hash retrieval. Moreover, the hash code length is set empirically, which causes bit redundancy and degrades retrieval accuracy. 2) Most existing methods exploit shallow models, which fail to fully capture the higher-level correlation of multi-modal data. 3) Most existing methods adopt an online hashing strategy based on immutable direct projection, which generates query codes for new samples without considering the differences between semantic categories. In this paper, we propose a Semantic-driven Interpretable Deep Multi-modal Hashing (SIDMH) method that generates interpretable hash codes driven by semantic categories within a deep hashing architecture and solves all three problems in an integrated model. The main contributions are: 1) A novel deep multi-modal hashing network is developed to progressively extract hidden representations of heterogeneous modality features and deeply exploit the complementarity of multi-modal data. 2) Interpretable hash codes are learned, with the discriminant information of different categories distinctively embedded into the hash codes and their different impacts on hash retrieval intuitively explained. In addition, the code length depends on the number of categories in the dataset, which reduces bit redundancy and improves retrieval accuracy. 3) The semantic-driven online hashing strategy encodes the significant branches and discards the negligible branches of each query sample according to the semantics it contains, so it can capture different semantics in dynamic queries. Finally, we consider both the nearest neighbor similarity and the semantic similarity of hash codes. Experiments on several public multimedia retrieval datasets validate the superiority of the proposed method.
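The abstract's first criticism, that existing methods treat every hash bit as contributing equally to retrieval, can be illustrated with a generic sketch. The code below is not the SIDMH method: it only contrasts plain Hamming ranking (all bits equal) with a ranking that uses per-bit weights. The function names, the toy codes, and the weight values are all hypothetical and chosen purely for illustration.

```python
from typing import List, Optional, Sequence


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the bit positions where two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def weighted_hamming_distance(
    a: Sequence[int], b: Sequence[int], weights: Sequence[float]
) -> float:
    """Sum the weights of the differing bit positions (hypothetical per-bit weights)."""
    if not (len(a) == len(b) == len(weights)):
        raise ValueError("codes and weights must have the same length")
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def rank_by_distance(
    query: Sequence[int],
    database: Sequence[Sequence[int]],
    weights: Optional[Sequence[float]] = None,
) -> List[int]:
    """Return database indices sorted by ascending distance to the query.

    Ties are broken by the lower index so the result is deterministic.
    """
    if weights is None:
        dist = [float(hamming_distance(query, code)) for code in database]
    else:
        dist = [weighted_hamming_distance(query, code, weights) for code in database]
    return sorted(range(len(database)), key=lambda i: (dist[i], i))


# Toy 4-bit codes; real systems would use much longer codes.
database = [[1, 0, 1, 0], [1, 1, 1, 0], [0, 0, 1, 1]]
query = [1, 0, 1, 1]

# All bits count equally.
uniform_ranking = rank_by_distance(query, database)

# Assumed weights: the first bit matters most, the last bit least.
weighted_ranking = rank_by_distance(query, database, weights=[4.0, 1.0, 1.0, 0.5])
```

With uniform bits, the third database code ties with the first at distance 1. Under the assumed weights, the third code drops to last place because it differs from the query in the most heavily weighted bit.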

Citation:

Pages: 4541-4554

Page count: 14
