A Quantum-Inspired Framework in Leader-Servant Mode for Large-Scale Multi-Modal Place Recognition

Cited by: 0
|
Authors
Zhang, Ruonan [1 ]
Li, Ge [2 ]
Gao, Wei [3 ,4 ]
Liu, Shan [5 ]
Affiliations
[1] Ningxia Univ, Sch Adv Interdisciplinary Studies, Zhongwei 755000, Peoples R China
[2] Peking Univ, Sch Elect & Comp Engn SECE, Shenzhen Grad Sch, Guangdong Prov Key Lab Ultra High Definit Immers M, Shenzhen 518055, Peoples R China
[3] Peking Univ, Sch Elect & Comp Engn SECE, Shenzhen Grad Sch, Guangdong Prov Key Lab Ultra High Definit Immers M, Shenzhen 518055, Peoples R China
[4] Peng Cheng Natl Lab, Shenzhen 518066, Peoples R China
[5] Tencent, Media Lab, Palo Alto, CA 94301 USA
Funding
National Natural Science Foundation of China;
Keywords
Training; Point cloud compression; Feature extraction; Interference; Wave functions; Quantum mechanics; Image recognition; Fuses; Convolution; Three-dimensional displays; Multi-modal; place recognition; 3D point cloud; image; feature fusion;
DOI
10.1109/TITS.2024.3497574
Chinese Library Classification (CLC)
TU [Architectural Science];
Discipline Code
0813 ;
Abstract
Multi-modal place recognition aims to exploit the diversified information carried by different modalities to invigorate place recognition tasks. The key challenges lie in the representation gap between modalities, the feature fusion method, and the relationships among modalities. Most existing methods are uni-modal, leaving these challenges unsolved. To address them, inspired by the double-slit experiment in physics and by cooperative working modes, we introduce a quantum-theory-inspired leader-servant multi-modal framework for large-scale place recognition. Two key modules are designed: a quantum representation module and an interference-aware fusion module. The former captures the diversity of multi-modal data and bridges the representation gap, while the latter effectively fuses multi-modal features under the guidance of quantum theory. In addition, we propose a leader-servant training strategy for stable training, in which three cases are considered: the multi-modal loss acts as the leader to preserve overall characteristics, while the uni-modal losses act as servants to lighten the modality bias of the leader. The framework is also compatible with uni-modal place recognition. Finally, experiments on three datasets demonstrate the efficiency, generalization, and robustness of the proposed method compared with existing methods.
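The abstract does not give the exact loss formulation. As a purely illustrative sketch, one plausible reading of the leader-servant training strategy is a weighted sum in which the multi-modal loss carries full weight and the uni-modal losses are down-weighted; all names and the scalar `alpha` below are assumptions, not the authors' definitions:

```python
def leader_servant_loss(loss_multi, loss_image, loss_cloud, alpha=0.5):
    """Hypothetical leader-servant objective (illustrative only).

    The multi-modal loss leads with full weight, preserving overall
    characteristics; the uni-modal image and point-cloud losses serve,
    down-weighted by alpha to soften each modality's influence.
    """
    return loss_multi + alpha * (loss_image + loss_cloud)

# Example: the leader dominates, the servants add a damped correction.
total = leader_servant_loss(1.0, 0.4, 0.6, alpha=0.5)
print(total)  # 1.0 + 0.5 * (0.4 + 0.6) = 1.5
```

Any real implementation would also have to handle the "three cases" the abstract mentions (e.g., which modalities are available), which this sketch does not model.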
Pages: 2027 - 2039
Number of pages: 13
Related Papers
(50 items in total)
  • [31] Beyond Entities: A Large-Scale Multi-Modal Knowledge Graph with Triplet Fact Grounding
    Liu, Jingping
    Zhang, Mingchuan
    Li, Weichen
    Wang, Chao
    Li, Shuang
    Jiang, Haiyun
    Jiang, Sihang
    Xiao, Yanghua
    Chen, Yunwen
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 17, 2024, : 18653 - 18661
  • [32] Multi-Modal Learning: Study on A Large-Scale Micro-Video Data Collection
    Chen, Jingyuan
    MM'16: PROCEEDINGS OF THE 2016 ACM MULTIMEDIA CONFERENCE, 2016, : 1454 - 1458
  • [33] GEMSim: A GPU-accelerated multi-modal mobility simulator for large-scale scenarios
    Saprykin, Aleksandr
    Chokani, Ndaona
    Abhari, Reza S.
    SIMULATION MODELLING PRACTICE AND THEORY, 2019, 94 : 199 - 214
  • [34] Leveraging Multi-modal Prior Knowledge for Large-scale Concept Learning in Noisy Web Data
    Liang, Junwei
    Jiang, Lu
    Meng, Deyu
    Hauptmann, Alexander
    PROCEEDINGS OF THE 2017 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL (ICMR'17), 2017, : 32 - 40
  • [35] A novel multi-modal incremental tensor decomposition for anomaly detection in large-scale networks
    Fan, Rongqiao
    Fan, Qiyuan
    Li, Xue
    Wang, Puming
    Xu, Jing
    Jin, Xin
    Yao, Shaowen
    Liu, Peng
    INFORMATION SCIENCES, 2024, 681
  • [36] REINFORCE: rapid augmentation of large-scale multi-modal transport networks for resilience enhancement
    Henry, Elise
    Furno, Angelo
    El Faouzi, Nour-Eddin
    APPLIED NETWORK SCIENCE, 6
  • [37] BEAT: A Large-Scale Semantic and Emotional Multi-modal Dataset for Conversational Gestures Synthesis
    Liu, Haiyang
    Zhu, Zihao
    Iwamoto, Naoya
    Peng, Yichen
    Li, Zhengqing
    Zhou, You
    Bozkurt, Elif
    Zheng, Bo
    COMPUTER VISION, ECCV 2022, PT VII, 2022, 13667 : 612 - 630
  • [38] Multi-modal artificial dura for simultaneous large-scale optical access and large-scale electrophysiology in non-human primate cortex
    Griggs, Devon J.
    Khateeb, Karam
    Zhou, Jasmine
    Liu, Teng
    Wang, Ruikang
    Yazdan-Shahmorad, Azadeh
    JOURNAL OF NEURAL ENGINEERING, 2021, 18 (05)
  • [39] A Hybrid Quantum-Inspired Particle Swarm Evolution Algorithm and SQP Method for Large-Scale Economic Dispatch Problems
    Niu, Qun
    Zhou, Zhuo
    Zeng, Tingting
    BIO-INSPIRED COMPUTING AND APPLICATIONS, 2012, 6840 : 207 - 214
  • [40] IBISCape: A Simulated Benchmark for multi-modal SLAM Systems Evaluation in Large-scale Dynamic Environments
    Soliman, Abanob
    Bonardi, Fabien
    Sidibe, Desire
    Bouchafa, Samia
    JOURNAL OF INTELLIGENT & ROBOTIC SYSTEMS, 2022, 106 (03)