Semantic-Driven Interpretable Deep Multi-Modal Hashing for Large-Scale Multimedia Retrieval

Cited: 25
Authors
Lu, Xu [1 ]
Liu, Li [1 ]
Nie, Liqiang [2 ]
Chang, Xiaojun [3 ]
Zhang, Huaxiang [1 ]
Affiliations
[1] Shandong Normal Univ, Sch Informat Sci & Engn, Jinan 250358, Peoples R China
[2] Shandong Univ, Sch Comp Sci & Technol, Qingdao 266237, Peoples R China
[3] Monash Univ, Fac Informat Technol, Clayton, Vic 3800, Australia
Funding
National Natural Science Foundation of China;
Keywords
Semantics; Task analysis; Data models; Feature extraction; Redundancy; Fuses; Optimization; Multi-modal hashing; large-scale multimedia retrieval; interpretable hashing
DOI
10.1109/TMM.2020.3044473
Chinese Library Classification
TP [automation technology, computer technology];
Discipline classification code
0812
Abstract
Multi-modal hashing focuses on fusing different modalities and exploring the complementarity of heterogeneous multi-modal data for compact hash learning. However, existing multi-modal hashing methods still suffer from several problems: 1) Almost all existing methods generate unexplainable hash codes. They roughly assume that each hash code bit contributes equally to the retrieval results, ignoring the discriminative information embedded in hash learning and the semantic similarity in hash retrieval. Moreover, the hash code length is set empirically, which causes bit redundancy and degrades retrieval accuracy. 2) Most existing methods rely on shallow models that fail to fully capture the higher-level correlations of multi-modal data. 3) Most existing methods adopt an online hashing strategy based on a fixed direct projection, which generates query codes for new samples without considering the differences among semantic categories. In this paper, we propose a Semantic-driven Interpretable Deep Multi-modal Hashing (SIDMH) method to generate interpretable hash codes driven by semantic categories within a deep hashing architecture, which addresses all three problems in an integrated model. The main contributions are: 1) A novel deep multi-modal hashing network is developed to progressively extract hidden representations of heterogeneous modality features and deeply exploit the complementarity of multi-modal data. 2) Interpretable hash codes are learned, with the discriminative information of different categories distinctively embedded into the codes and their different impacts on hash retrieval intuitively explained. Moreover, the code length depends on the number of categories in the dataset, which reduces bit redundancy and improves retrieval accuracy.
3) The semantic-driven online hashing strategy encodes the significant branches and discards the negligible branches of each query sample according to the semantics it contains, so that different semantics in dynamic queries can be captured. Finally, we consider both the nearest-neighbor similarity and the semantic similarity of hash codes. Experiments on several public multimedia retrieval datasets validate the superiority of the proposed method.
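The abstract's core interpretability idea, that the code length equals the number of semantic categories so each bit can be read as a category indicator, can be illustrated with a minimal sketch. This is a hypothetical toy, not the authors' SIDMH implementation: the function names, the thresholding rule, and plain Hamming ranking are all assumptions made for illustration.

```python
# Toy sketch of category-driven interpretable hash codes (hypothetical,
# not the paper's actual method): the code length equals the number of
# semantic categories, and bit i marks whether a sample carries the
# semantics of category i.

NUM_CATEGORIES = 4  # assumed dataset with 4 semantic categories

def encode(category_scores, threshold=0.5):
    """Binarize per-category relevance scores into an interpretable code:
    bit i is 1 iff the sample is judged relevant to category i."""
    return [1 if s >= threshold else 0 for s in category_scores]

def hamming(a, b):
    """Number of differing bits between two codes."""
    return sum(x != y for x, y in zip(a, b))

def retrieve(query_scores, database_codes, top_k=2):
    """Rank database items by Hamming distance to the query code.
    Bits shared with the query directly explain each match by category."""
    q = encode(query_scores)
    ranked = sorted(range(len(database_codes)),
                    key=lambda i: hamming(q, database_codes[i]))
    return ranked[:top_k], q

# Pre-computed database codes (one bit per category).
db = [
    [1, 0, 0, 0],  # item 0: category 0 only
    [1, 1, 0, 0],  # item 1: categories 0 and 1
    [0, 0, 1, 1],  # item 2: categories 2 and 3
]

top, q_code = retrieve([0.9, 0.8, 0.1, 0.2], db)
print(q_code)  # [1, 1, 0, 0]
print(top)     # [1, 0] -- item 1 matches on both active category bits
```

Because each bit names a category, the ranking is self-explanatory: item 1 is returned first since it shares both active category bits with the query, which is the kind of bit-level interpretability the abstract claims for SIDMH.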
Pages: 4541-4554
Page count: 14
Related Papers
50 items total
  • [41] Unsupervised Multi-modal Hashing for Cross-Modal Retrieval
    Yu, Jun
    Wu, Xiao-Jun
    Zhang, Donglin
    COGNITIVE COMPUTATION, 2022, 14 (03) : 1159 - 1171
  • [42] Multi-Task Multi-modal Semantic Hashing for Web Image Retrieval with Limited Supervision
    Xie, Liang
    Zhu, Lei
    Cheng, Zhiyong
    MULTIMEDIA MODELING (MMM 2017), PT I, 2017, 10132 : 465 - 477
  • [44] Efficient Large-Scale Multi-Modal Classification
    Kiela, Douwe
    Grave, Edouard
    Joulin, Armand
    Mikolov, Tomas
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 5198 - 5204
  • [45] Parametric CAD Primitive Retrieval via Multi-Modal Fusion and Deep Hashing
    Xu, Minyang
    Lou, Yunzhong
    Ma, Weijian
    Li, Xueyang
    Zhou, Xiangdong
    PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024, 2024, : 1061 - 1069
  • [46] LCEMH: Label Correlation Enhanced Multi-modal Hashing for efficient multi-modal retrieval
    Zheng, Chaoqun
    Zhu, Lei
    Zhang, Zheng
    Duan, Wenjun
    Lu, Wenpeng
    INFORMATION SCIENCES, 2024, 659
  • [47] Deep Multi-Level Semantic Hashing for Cross-Modal Retrieval
    Ji, Zhenyan
    Yao, Weina
    Wei, Wei
    Song, Houbing
    Pi, Huaiyu
    IEEE ACCESS, 2019, 7 : 23667 - 23674
  • [48] A Framework for Enabling Unpaired Multi-Modal Learning for Deep Cross-Modal Hashing Retrieval
    Williams-Lekuona, Mikel
    Cosma, Georgina
    Phillips, Iain
    JOURNAL OF IMAGING, 2022, 8 (12)
  • [49] An Enhanced Deep Hashing Method for Large-Scale Image Retrieval
    Chen, Cong
    Tong, Weiqin
    Ding, Xuehai
    Zhi, Xiaoli
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2019, PT I, 2019, 11775 : 382 - 393
  • [50] Spatial pyramid deep hashing for large-scale image retrieval
    Zhao, Wanqing
    Luo, Hangzai
    Peng, Jinye
    Fan, Jianping
    NEUROCOMPUTING, 2017, 243 : 166 - 173