Cognitive multi-modal consistent hashing with flexible semantic transformation

Cited: 15
Authors
An, Junfeng [1 ]
Luo, Haoyang [1 ]
Zhang, Zheng [1 ]
Zhu, Lei [2 ]
Lu, Guangming [1 ]
Affiliations
[1] Harbin Inst Technol, Shenzhen Key Lab Visual Object Detect & Recognit, Shenzhen 518055, Peoples R China
[2] Shandong Normal Univ, Sch Informat Sci & Engn, Jinan 250000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Social geo-media; Learning to hash; Semantic preserving; Discrete optimization; Similarity search; INFORMATION; MULTIMEDIA;
DOI
10.1016/j.ipm.2021.102743
Chinese Library Classification
TP [Automation and computer technology];
Discipline code
0812 ;
Abstract
Multi-modal hashing encodes large-scale social geo-media multimedia data from multiple sources into a common discrete hash space, in which the heterogeneous correlations across modalities can be explored and preserved in semantic-consistent hash codes. Current research on multi-modal hashing mainly focuses on common data reconstruction, but fails to effectively distill the intrinsic and consensus structures of multi-modal data or to fully exploit the inherent semantic knowledge needed to capture semantic-consistent information across modalities, leading to unsatisfactory retrieval performance. To address this problem and develop an efficient multi-modal geographical retrieval method, in this article we propose a discriminative multi-modal hashing framework named Cognitive Multi-modal Consistent Hashing (CMCH), which progressively pursues structure consensus over heterogeneous multi-modal data while exploring informative transformed semantics. Specifically, we construct a parameter-free collaborative multi-modal fusion module to incorporate and excavate the underlying common components of multi-source data. In particular, our formulation seeks joint multi-modal compatibility among modalities under a self-adaptive weighting scheme, which takes full advantage of their complementary properties. Moreover, a cognitive self-paced learning policy is leveraged to conduct progressive feature aggregation, coalescing multi-modal data onto the established common latent space in a curriculum-learning manner. Furthermore, deep semantic transform learning is elaborated to generate flexible semantics that interactively guide collaborative hash-code learning. An efficient discrete learning algorithm is devised to solve the resulting optimization problem, yielding stable solutions on large-scale multi-modal retrieval tasks.
Extensive experiments on four large-scale multi-modal datasets demonstrate the encouraging performance of the proposed CMCH method against state-of-the-art methods in both multi-modal information retrieval and computational efficiency. The source code is available at https://github.com/JunfengAn1998a/CMCH.
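To make the self-adaptive weighting idea concrete, the following is a minimal illustrative sketch (not the paper's actual CMCH objective or code): several modality features are projected into a common space, fused with modality weights that are iteratively re-estimated from each modality's residual to the fused representation, and finally binarized by the sign function to produce hash codes. All names and the random-projection step are assumptions for illustration only.

```python
import numpy as np

def adaptive_weighted_fusion(modalities, n_bits=16, n_iter=10, gamma=2.0, seed=0):
    """Illustrative sketch of adaptively weighted multi-modal fusion hashing.

    modalities: list of (n_samples, d_m) feature matrices, one per modality.
    Returns binary codes B in {-1, +1}^(n_samples x n_bits) and the learned
    modality weights w (summing to 1). Hypothetical, simplified procedure.
    """
    rng = np.random.default_rng(seed)
    m = len(modalities)
    # Project each modality to the common n_bits-dimensional space
    # (random projections here; a learned projection in a real method).
    projs = [rng.standard_normal((X.shape[1], n_bits)) for X in modalities]
    Z = [X @ P for X, P in zip(modalities, projs)]
    w = np.full(m, 1.0 / m)  # start from uniform modality weights
    for _ in range(n_iter):
        # Weighted common representation.
        F = sum(wi * Zi for wi, Zi in zip(w, Z))
        # Re-weight each modality inversely to its residual w.r.t. F,
        # mimicking a self-adaptive weighting scheme with smoothing power gamma.
        losses = np.array([np.linalg.norm(F - Zi) for Zi in Z])
        w = (1.0 / (losses + 1e-12)) ** (1.0 / (gamma - 1.0))
        w /= w.sum()
    B = np.sign(F)
    B[B == 0] = 1.0  # resolve exact zeros to +1
    return B, w
```

Modalities that agree more closely with the fused representation receive larger weights, so complementary but noisy sources contribute proportionally less, which is the intuition behind the self-adaptive weighting described in the abstract.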
Pages: 20
Related papers (50 records)
  • [1] Flexible Dual Multi-Modal Hashing for Incomplete Multi-Modal Retrieval
    Wei, Yuhong
    An, Junfeng
    INTERNATIONAL JOURNAL OF IMAGE AND GRAPHICS, 2024
  • [2] Hashing-based Multi-modal Semantic Communication
    Zhu, Yibo
    Gu, Hongyu
    Nie, Jiangtian
    Tang, Jianhang
    Jin, Jiangming
    Zhang, Yang
    2024 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE, WCNC 2024, 2024
  • [3] Flexible Multi-modal Hashing for Scalable Multimedia Retrieval
    Zhu, Lei
    Lu, Xu
    Cheng, Zhiyong
    Li, Jingjing
    Zhang, Huaxiang
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2020, 11 (02)
  • [4] Graph Convolutional Multi-modal Hashing for Flexible Multimedia Retrieval
    Lu, Xu
    Zhu, Lei
    Liu, Li
    Nie, Liqiang
    Zhang, Huaxiang
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 1414 - 1422
  • [5] Sparse Multi-Modal Hashing
    Wu, Fei
    Yu, Zhou
    Yang, Yi
    Tang, Siliang
    Zhang, Yin
    Zhuang, Yueting
    IEEE TRANSACTIONS ON MULTIMEDIA, 2014, 16 (02) : 427 - 439
  • [6] Bit-aware Semantic Transformer Hashing for Multi-modal Retrieval
    Tan, Wentao
    Zhu, Lei
    Guan, Weili
    Li, Jingjing
    Cheng, Zhiyong
    PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22), 2022, : 982 - 991
  • [7] Hadamard matrix-guided multi-modal hashing for multi-modal retrieval
    Yu, Jun
    Huang, Wei
    Li, Zuhe
    Shu, Zhenqiu
    Zhu, Liang
    DIGITAL SIGNAL PROCESSING, 2022, 130
  • [8] Flexible Online Multi-modal Hashing for Large-scale Multimedia Retrieval
    Lu, Xu
    Zhu, Lei
    Cheng, Zhiyong
    Li, Jingjing
    Nie, Xiushan
    Zhang, Huaxiang
    PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, : 1129 - 1137
  • [9] Multi-Facet Weighted Asymmetric Multi-Modal Hashing Based on Latent Semantic Distribution
    Lu, Xu
    Liu, Li
    Ning, Lixin
    Zhang, Liang
    Mu, Shaomin
    Zhang, Huaxiang
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 7307 - 7320
  • [10] Multi-Task Multi-modal Semantic Hashing for Web Image Retrieval with Limited Supervision
    Xie, Liang
    Zhu, Lei
    Cheng, Zhiyong
    MULTIMEDIA MODELING (MMM 2017), PT I, 2017, 10132 : 465 - 477