Enhancing Visual Place Recognition With Hybrid Attention Mechanisms in MixVPR

Cited by: 0
Authors
Hu, Jun [1]
Nie, Jiwei [1, 2, 4]
Ning, Zuotao [1]
Feng, Chaolu [3]
Wang, Luyang [1]
Li, Jingyao [1]
Cheng, Shuai [1]
Affiliations
[1] Neusoft Reach Automot Technol Shenyang Co Ltd, Shenyang 110179, Peoples R China
[2] Northeastern Univ, Sch Comp Sci & Engn, Shenyang 110819, Peoples R China
[3] Minist Educ, Key Lab Intelligent Comp Med Image, Shenyang 110169, Peoples R China
[4] Northeastern Univ, Software Coll, Shenyang 110819, Peoples R China
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Feature extraction; Transformers; Training; Vectors; Mixers; Attention mechanisms; Pipelines; Frequency modulation; Deep learning; Convolutional neural networks; Visual place recognition; SLAM; autonomous driving; vision transformer; attention mechanism
DOI
10.1109/ACCESS.2024.3487171
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Visual Place Recognition (VPR) is a fundamental task in robotics and computer vision, where the ability to recognize locations from visual inputs is crucial for autonomous navigation systems. Traditional methods, which rely on handcrafted features or standard convolutional neural networks (CNNs), struggle with environmental changes that significantly alter a place's appearance. Recent advances in deep learning have improved VPR by focusing on deep-learned features, enhancing robustness under varying conditions. However, these methods often overlook saliency cues, which leads to inefficiencies in dynamic scenes. To address these limitations, we propose an improved MixVPR model that incorporates both self-attention and cross-attention through a spatial-wise hybrid attention mechanism. This enhancement integrates spatial saliency cues into the global image embedding, improving accuracy and reliability. We also use the DINOv2 vision transformer for robust feature extraction. Extensive experiments on mainstream VPR benchmarks demonstrate that our method achieves superior performance while maintaining computational efficiency. Ablation studies and visualizations further validate the contributions of our attention mechanisms to the model's performance improvement.
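The abstract names self-attention and cross-attention but does not say how the two are combined. As a generic illustration only, not the authors' architecture, the sketch below applies plain scaled dot-product attention twice and sums the results with a residual connection. The function names, the `context` input, the residual-sum fusion, and the absence of learned projection matrices are all assumptions made for this example.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors.

    No learned projections are applied; a real model would first map
    its inputs through trained query/key/value weight matrices.
    """
    dim = len(keys[0])
    out = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out


def hybrid_attention(tokens, context):
    """Hypothetical fusion: residual sum of self- and cross-attention.

    `tokens` stands in for spatial feature tokens of one image and
    `context` for a second token set that they attend to. Both roles
    are assumptions; the abstract does not specify them.
    """
    self_out = attention(tokens, tokens, tokens)      # tokens attend to themselves
    cross_out = attention(tokens, context, context)   # tokens attend to the context
    return [
        [t + s + c for t, s, c in zip(tok, so, co)]
        for tok, so, co in zip(tokens, self_out, cross_out)
    ]


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = [[0.5, 0.5], [1.0, -1.0]]
fused = hybrid_attention(tokens, context)
```

The output keeps one vector per input token, so a downstream aggregation stage would see the same spatial layout with attention-weighted information added to each position.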
Pages: 159847-159859
Page count: 13