Cognitive multi-modal consistent hashing with flexible semantic transformation

Cited: 15
Authors
An, Junfeng [1 ]
Luo, Haoyang [1 ]
Zhang, Zheng [1 ]
Zhu, Lei [2 ]
Lu, Guangming [1 ]
Affiliations
[1] Harbin Inst Technol, Shenzhen Key Lab Visual Object Detect & Recognit, Shenzhen 518055, Peoples R China
[2] Shandong Normal Univ, Sch Informat Sci & Engn, Jinan 250000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Social geo-media; Learning to hash; Semantic preserving; Discrete optimization; Similarity search; INFORMATION; MULTIMEDIA;
DOI
10.1016/j.ipm.2021.102743
Chinese Library Classification
TP [Automation and computer technology];
Discipline code
0812 ;
Abstract
Multi-modal hashing encodes large-scale social geo-media multimedia data from multiple sources into a common discrete hash space, in which the heterogeneous correlations across modalities can be explored and preserved in semantic-consistent hash codes. Current research on multi-modal hashing mainly focuses on common data reconstruction, but fails to effectively distill the intrinsic and consensus structures of multi-modal data or to fully exploit the inherent semantic knowledge needed to capture semantic-consistent information across modalities, leading to unsatisfactory retrieval performance. To address this problem and develop an efficient multi-modal geographical retrieval method, in this article we propose a discriminative multi-modal hashing framework named Cognitive Multi-modal Consistent Hashing (CMCH), which progressively pursues structure consensus over heterogeneous multi-modal data while exploring informative transformed semantics. Specifically, we construct a parameter-free collaborative multi-modal fusion module to incorporate and excavate the underlying common components of multi-source data. In particular, our formulation seeks joint multi-modal compatibility among modalities under a self-adaptive weighting scheme, which takes full advantage of their complementary properties. Moreover, a cognitive self-paced learning policy is leveraged to conduct progressive feature aggregation, coalescing multi-modal data onto the established common latent space in a curriculum-learning manner. Furthermore, deep semantic transform learning is elaborated to generate flexible semantics that interactively guide collaborative hash-code learning. An efficient discrete learning algorithm is devised to solve the resulting optimization problem, yielding stable solutions on large-scale multi-modal retrieval tasks.
Extensive experiments on four large-scale multi-modal datasets demonstrate the encouraging performance of the proposed CMCH method against state-of-the-art methods in both multi-modal information retrieval and computational efficiency. The source code is available at https://github.com/JunfengAn1998a/CMCH.
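To make the self-adaptive weighting idea concrete, the following is a minimal illustrative sketch (not the paper's actual CMCH objective or code): several modality features are projected into a common space, fused with modality weights that are iteratively re-estimated from each modality's residual to the fused representation, and finally binarized by the sign function to produce hash codes. All names and the random-projection step are assumptions for illustration only.

```python
import numpy as np

def adaptive_weighted_fusion(modalities, n_bits=16, n_iter=10, gamma=2.0, seed=0):
    """Illustrative sketch of adaptively weighted multi-modal fusion hashing.

    modalities: list of (n_samples, d_m) feature matrices, one per modality.
    Returns binary codes B in {-1, +1}^(n_samples x n_bits) and the learned
    modality weights w (summing to 1). Hypothetical, simplified procedure.
    """
    rng = np.random.default_rng(seed)
    m = len(modalities)
    # Project each modality to the common n_bits-dimensional space
    # (random projections here; a learned projection in a real method).
    projs = [rng.standard_normal((X.shape[1], n_bits)) for X in modalities]
    Z = [X @ P for X, P in zip(modalities, projs)]
    w = np.full(m, 1.0 / m)  # start from uniform modality weights
    for _ in range(n_iter):
        # Weighted common representation.
        F = sum(wi * Zi for wi, Zi in zip(w, Z))
        # Re-weight each modality inversely to its residual w.r.t. F,
        # mimicking a self-adaptive weighting scheme with smoothing power gamma.
        losses = np.array([np.linalg.norm(F - Zi) for Zi in Z])
        w = (1.0 / (losses + 1e-12)) ** (1.0 / (gamma - 1.0))
        w /= w.sum()
    B = np.sign(F)
    B[B == 0] = 1.0  # resolve exact zeros to +1
    return B, w
```

Modalities that agree more closely with the fused representation receive larger weights, so complementary but noisy sources contribute proportionally less, which is the intuition behind the self-adaptive weighting described in the abstract.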
Pages: 20
Related papers (50 records)
  • [1] Flexible Dual Multi-Modal Hashing for Incomplete Multi-Modal Retrieval
    Wei, Yuhong
    An, Junfeng
    INTERNATIONAL JOURNAL OF IMAGE AND GRAPHICS, 2024
  • [2] Hashing-based Multi-modal Semantic Communication
    Zhu, Yibo
    Gu, Hongyu
    Nie, Jiangtian
    Tang, Jianhang
    Jin, Jiangming
    Zhang, Yang
    2024 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE, WCNC 2024, 2024
  • [3] Flexible Multi-modal Hashing for Scalable Multimedia Retrieval
    Zhu, Lei
    Lu, Xu
    Cheng, Zhiyong
    Li, Jingjing
    Zhang, Huaxiang
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2020, 11 (02)
  • [4] Graph Convolutional Multi-modal Hashing for Flexible Multimedia Retrieval
    Lu, Xu
    Zhu, Lei
    Liu, Li
    Nie, Liqiang
    Zhang, Huaxiang
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 1414 - 1422
  • [5] Sparse Multi-Modal Hashing
    Wu, Fei
    Yu, Zhou
    Yang, Yi
    Tang, Siliang
    Zhang, Yin
    Zhuang, Yueting
    IEEE TRANSACTIONS ON MULTIMEDIA, 2014, 16 (02) : 427 - 439
  • [6] Bit-aware Semantic Transformer Hashing for Multi-modal Retrieval
    Tan, Wentao
    Zhu, Lei
    Guan, Weili
    Li, Jingjing
    Cheng, Zhiyong
    PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22), 2022, : 982 - 991
  • [7] Hadamard matrix-guided multi-modal hashing for multi-modal retrieval
    Yu, Jun
    Huang, Wei
    Li, Zuhe
    Shu, Zhenqiu
    Zhu, Liang
    DIGITAL SIGNAL PROCESSING, 2022, 130
  • [8] Flexible Online Multi-modal Hashing for Large-scale Multimedia Retrieval
    Lu, Xu
    Zhu, Lei
    Cheng, Zhiyong
    Li, Jingjing
    Nie, Xiushan
    Zhang, Huaxiang
    PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, : 1129 - 1137
  • [9] Multi-Facet Weighted Asymmetric Multi-Modal Hashing Based on Latent Semantic Distribution
    Lu, Xu
    Liu, Li
    Ning, Lixin
    Zhang, Liang
    Mu, Shaomin
    Zhang, Huaxiang
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 7307 - 7320
  • [10] Multi-Task Multi-modal Semantic Hashing for Web Image Retrieval with Limited Supervision
    Xie, Liang
    Zhu, Lei
    Cheng, Zhiyong
    MULTIMEDIA MODELING (MMM 2017), PT I, 2017, 10132 : 465 - 477