Enhancing Visual Place Recognition With Hybrid Attention Mechanisms in MixVPR
Cited by: 0
Authors:
Hu, Jun [1]
Nie, Jiwei [1,2,4]
Ning, Zuotao [1]
Feng, Chaolu [3]
Wang, Luyang [1]
Li, Jingyao [1]
Cheng, Shuai [1]
Affiliations:
[1] Neusoft Reach Automot Technol Shenyang Co Ltd, Shenyang 110179, Peoples R China
[2] Northeastern Univ, Sch Comp Sci & Engn, Shenyang 110819, Peoples R China
[3] Minist Educ, Key Lab Intelligent Comp Med Image, Shenyang 110169, Peoples R China
[4] Northeastern Univ, Software Coll, Shenyang 110819, Peoples R China
Keywords:
Feature extraction; Transformers; Training; Vectors; Mixers; Attention mechanisms; Pipelines; Frequency modulation; Deep learning; Convolutional neural networks; Visual place recognition; SLAM; autonomous driving; deep learning; vision transformer; attention mechanism
DOI: 10.1109/ACCESS.2024.3487171
Chinese Library Classification: TP [Automation technology, computer technology]
Subject classification code: 0812
Abstract:
Visual Place Recognition (VPR) is a fundamental task in robotics and computer vision, where the ability to recognize locations from visual inputs is crucial for autonomous navigation systems. Traditional methods, which rely on handcrafted features or standard convolutional neural networks (CNNs), struggle with environmental changes that significantly alter a place's appearance. Recent advancements in deep learning have improved VPR by focusing on deep-learned features, enhancing robustness under varying conditions. However, these methods often overlook saliency cues, leading to inefficiencies in dynamic scenes. To address these limitations, we propose an improved MixVPR model that incorporates both self-attention and cross-attention mechanisms through a spatial-wise hybrid attention mechanism. This enhancement integrates spatial saliency cues into the global image embedding, improving accuracy and reliability. We also utilize the DINOv2 visual transformer for robust feature extraction. Extensive experiments on mainstream VPR benchmarks demonstrate that our method achieves superior performance while maintaining computational efficiency. Ablation studies and visualizations further validate the contributions of our attention mechanisms to the model's performance improvement.