Improved Scene Landmark Detection for Camera Localization

Cited by: 0
Authors
Do, Tien [1]
Sinha, Sudipta N. [2]
Affiliations
[1] Tesla, Austin, TX 78725 USA
[2] Microsoft, Redmond, WA USA
DOI
10.1109/3DV62453.2024.00069
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Camera localization methods based on retrieval, local feature matching, and 3D structure-based pose estimation are accurate but require high storage, are slow, and are not privacy-preserving. A method based on scene landmark detection (SLD) was recently proposed to address these limitations. It involves training a convolutional neural network (CNN) to detect a few predetermined, salient, scene-specific 3D points or landmarks and computing camera pose from the associated 2D-3D correspondences. Although SLD outperformed existing learning-based approaches, it was notably less accurate than 3D structure-based methods. In this paper, we show that the accuracy gap was due to insufficient model capacity and noisy labels during training. To mitigate the capacity issue, we propose to split the landmarks into subgroups and train a separate network for each subgroup. To generate better training labels, we propose using dense reconstructions to estimate visibility of scene landmarks. Finally, we present a compact architecture to improve memory efficiency. In terms of accuracy, our approach is on par with state-of-the-art structure-based methods on the INDOOR-6 dataset but runs significantly faster and uses less storage. Code and models can be found at https://github.com/microsoft/SceneLandmarkLocalization.
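
The subgroup idea in the abstract can be illustrated with a minimal Python sketch. It is not taken from the authors' repository: the contiguous partitioning rule, the `split_landmarks` and `merge_detections` helpers, the `(x, y, confidence)` detection format, and the `min_conf` threshold are all assumptions made for illustration. Each hypothetical detector stands in for one per-subgroup network, and the merged 2D detections would then be paired with the known 3D landmark positions and passed to a pose solver.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple

# Hypothetical detection format: (x, y, confidence) for one landmark.
Detection = Tuple[float, float, float]
# Hypothetical detector: maps an image to one Detection per landmark in its subgroup.
Detector = Callable[[Any], Sequence[Detection]]


def split_landmarks(landmark_ids: Sequence[int], num_groups: int) -> List[List[int]]:
    """Partition landmark IDs into contiguous, near-equal subgroups.

    Contiguous splitting is an assumption; the abstract does not say how
    the subgroups are chosen.
    """
    if num_groups < 1:
        raise ValueError("num_groups must be at least 1")
    base, extra = divmod(len(landmark_ids), num_groups)
    groups: List[List[int]] = []
    start = 0
    for g in range(num_groups):
        # The first `extra` groups receive one additional landmark.
        size = base + (1 if g < extra else 0)
        groups.append(list(landmark_ids[start:start + size]))
        start += size
    return groups


def merge_detections(
    groups: Sequence[Sequence[int]],
    detectors: Sequence[Detector],
    image: Any,
    min_conf: float = 0.5,
) -> Dict[int, Tuple[float, float]]:
    """Run one detector per subgroup and keep the confident 2D detections.

    Returns a mapping from landmark ID to its detected (x, y) position.
    """
    if len(groups) != len(detectors):
        raise ValueError("need exactly one detector per subgroup")
    merged: Dict[int, Tuple[float, float]] = {}
    for ids, detect in zip(groups, detectors):
        for landmark_id, (x, y, conf) in zip(ids, detect(image)):
            if conf >= min_conf:
                merged[landmark_id] = (x, y)
    return merged
```

The dictionary returned by `merge_detections` is keyed by landmark ID, so each retained 2D point can be looked up against its 3D landmark to form the 2D-3D correspondences mentioned in the abstract.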
Pages: 975-984
Page count: 10