Semantic Reinforced Attention Learning for Visual Place Recognition

Cited: 29
Authors
Peng, Guohao [1]
Yue, Yufeng [2]
Zhang, Jun [1]
Wu, Zhenyu [1]
Tang, Xiaoyu [1]
Wang, Danwei [1]
Affiliations
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
[2] Beijing Inst Technol, Sch Automat, Beijing 100081, Peoples R China
Keywords
LARGE-SCALE; FAB-MAP; LOCALIZATION; SLAM
DOI
10.1109/ICRA48506.2021.9561812
Chinese Library Classification
TP (Automation technology, computer technology)
Discipline Code
0812
Abstract
Large-scale visual place recognition (VPR) is inherently challenging because not all visual cues in an image are beneficial to the task. To highlight task-relevant visual cues in the feature embedding, existing attention mechanisms are either based on handcrafted rules or trained in a purely data-driven manner. To bridge the gap between the two, we propose a novel Semantic Reinforced Attention Learning Network (SRALNet), in which the inferred attention benefits from both semantic priors and data-driven fine-tuning. The contribution is twofold. (1) To suppress misleading local features, an interpretable local weighting scheme is proposed based on hierarchical feature distribution. (2) By exploiting the interpretability of the local weighting scheme, a semantically constrained initialization is proposed so that the local attention can be reinforced by semantic priors. Experiments demonstrate that our method outperforms state-of-the-art techniques on city-scale VPR benchmark datasets.
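The abstract gives no equations, but the core idea it describes, down-weighting misleading local features with attention before aggregating them into a global descriptor, can be sketched generically. This is a minimal illustration of attention-weighted aggregation, not the SRALNet architecture; the function name, the softmax weighting, and the toy inputs are assumptions.

```python
import numpy as np

def attention_weighted_descriptor(local_feats, attention_logits):
    """Aggregate N local features (N, D) into one global descriptor,
    weighting each location by an attention score (N,)."""
    # Softmax over locations turns logits into attention weights,
    # so task-irrelevant locations contribute less to the descriptor.
    w = np.exp(attention_logits - attention_logits.max())
    w /= w.sum()
    # Attention-weighted sum over locations.
    desc = (w[:, None] * local_feats).sum(axis=0)
    # L2-normalize, as is common for retrieval-style VPR descriptors.
    return desc / np.linalg.norm(desc)

# Toy example: 4 local features of dimension 3.
feats = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0],
                  [1.0, 1.0, 0.0]])
logits = np.array([2.0, -1.0, -1.0, 0.5])  # high weight on rows 0 and 3
d = attention_weighted_descriptor(feats, logits)
```

In the paper's setting the attention scores would come from a learned (semantically initialized) module rather than being given, but the aggregation step is of this weighted-sum form.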
Pages: 13415-13422
Page count: 8
Related Papers
50 records in total
  • [31] Visual and semantic ensemble for scene text recognition with gated dual mutual attention
    Liu, Zhiguang
    Wang, Liangwei
    Qiao, Jian
    [J]. INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL, 2022, 11 (04) : 669 - 680
  • [32] Better Deep Visual Attention with Reinforcement Learning in Action Recognition
    Wang, Gang
    Wang, Wenmin
    Wang, Jingzhuo
    Bu, Yaohua
    [J]. 2017 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2017,
  • [33] Semantic-aligned reinforced attention model for zero-shot learning
    Yang, Zaiquan
    Zhang, Yuqi
    Du, Yuxin
    Tong, Chao
    [J]. IMAGE AND VISION COMPUTING, 2022, 128
  • [34] Semantic-geometric visual place recognition: a new perspective for reconciling opposing views
    Garg, Sourav
    Suenderhauf, Niko
    Milford, Michael
    [J]. INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2022, 41 (06): : 573 - 598
  • [35] Visual place recognition from end-to-end semantic scene text features
    Raisi, Zobeir
    Zelek, John
    [J]. FRONTIERS IN ROBOTICS AND AI, 2024, 11
  • [36] R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering
    Lu, Pan
    Ji, Lei
    Zhang, Wei
    Duan, Nan
    Zhou, Ming
    Wang, Jianyong
    [J]. KDD'18: PROCEEDINGS OF THE 24TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2018, : 1880 - 1889
  • [37] Learning discriminative visual semantic embedding for zero-shot recognition
    Xie, Yurui
    Song, Tiecheng
    Yuan, Jianying
    [J]. SIGNAL PROCESSING-IMAGE COMMUNICATION, 2023, 115
  • [38] Where Is Your Place, Visual Place Recognition?
    Garg, Sourav
    Fischer, Tobias
    Milford, Michael
    [J]. PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 4416 - 4425
  • [39] Jointly Learning Visual Poses and Pose Lexicon for Semantic Action Recognition
    Zhou, Lijuan
    Li, Wanqing
    Ogunbona, Philip
    Zhang, Zhengyou
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2020, 30 (02) : 457 - 467
  • [40] Multimodal Visual-Semantic Representations Learning for Scene Text Recognition
    Gao, Xinjian
    Pang, Ye
    Liu, Yuyu
    Han, Maokun
    Yu, Jun
    Wang, Wei
    Chen, Yuanxu
    [J]. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (07)