Revisit Anything: Visual Place Recognition via Image Segment Retrieval

Cited: 0
Authors
Garg, Kartik [1 ]
Shubodh, Sai [2 ]
Kolathaya, Shishir [1 ]
Krishna, Madhava [2 ]
Garg, Sourav [3 ]
Affiliations
[1] Indian Inst Sci IISc, Bengaluru, India
[2] Int Inst Informat Technol, Hyderabad, India
[3] Univ Adelaide, Adelaide, SA, Australia
Keywords
Visual Place Recognition; Image Segmentation; Robotics; SCALE;
DOI
10.1007/978-3-031-73113-6_19
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Accurately recognizing a revisited place is crucial for embodied agents to localize and navigate. This requires visual representations to be distinct, despite strong variations in camera viewpoint and scene appearance. Existing visual place recognition pipelines encode the whole image and search for matches. This poses a fundamental challenge in matching two images of the same place captured from different camera viewpoints: the similarity of what overlaps can be dominated by the dissimilarity of what does not overlap. We address this by encoding and searching for image segments instead of whole images. We propose to use open-set image segmentation to decompose an image into 'meaningful' entities (i.e., things and stuff). This enables us to create a novel image representation as a collection of multiple overlapping subgraphs connecting a segment with its neighboring segments, dubbed SuperSegment. Furthermore, to efficiently encode these SuperSegments into compact vector representations, we propose a novel factorized representation of feature aggregation. We show that retrieving these partial representations leads to significantly higher recognition recall than typical whole-image-based retrieval. Our segments-based approach, dubbed SegVLAD, sets a new state-of-the-art in place recognition on a diverse selection of benchmark datasets, while being applicable to both generic and task-specialized image encoders. Finally, we demonstrate the potential of our method to "revisit anything" by evaluating it on an object instance retrieval task, bridging two disparate research areas, visual place recognition and object-goal navigation, through their common aim of recognizing goal objects specific to a place. Source code: https://github.com/AnyLoc/Revisit-Anything.
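The abstract describes encoding each segment together with its neighboring segments into a "SuperSegment" descriptor, then retrieving these partial descriptors instead of one whole-image descriptor. The following is a minimal illustrative sketch of that retrieval idea only, not the paper's SegVLAD implementation: the actual method uses open-set segmentation and a factorized VLAD-style aggregation, whereas here segment features are assumed given, neighbor aggregation is a plain sum, and all function names are hypothetical.

```python
import numpy as np

def supersegment_descriptors(seg_feats, adjacency):
    """Build one descriptor per SuperSegment: each segment's feature summed
    with its neighbors' features (a first-order subgraph), then L2-normalized.

    seg_feats: (N, D) array of per-segment features.
    adjacency: dict mapping segment index -> list of neighbor indices.
    """
    descs = []
    for i in range(len(seg_feats)):
        group = [i] + list(adjacency.get(i, []))      # segment + its neighbors
        agg = seg_feats[group].sum(axis=0)            # simple sum aggregation
        descs.append(agg / (np.linalg.norm(agg) + 1e-12))
    return np.stack(descs)

def retrieve(query_descs, db_descs):
    """For each query SuperSegment, return the index of the most similar
    database SuperSegment (cosine similarity; inputs are unit-norm)."""
    sims = query_descs @ db_descs.T
    return sims.argmax(axis=1)
```

Because every segment anchors its own overlapping SuperSegment, a query image yields many partial descriptors; a match on any one of them can recognize the place even when most of the two views does not overlap.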
Pages: 326 - 343
Page count: 18