Improved Scene Landmark Detection for Camera Localization

Cited by: 0

Authors:

Do, Tien [1]

Sinha, Sudipta N. [2]

Affiliations:

[1] Tesla, Austin, TX 78725 USA

[2] Microsoft, Redmond, WA USA

Source:

2024 INTERNATIONAL CONFERENCE IN 3D VISION, 3DV 2024 | 2024

Keywords:

DOI:

10.1109/3DV62453.2024.00069

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Camera localization methods based on retrieval, local feature matching, and 3D structure-based pose estimation are accurate but require high storage, are slow, and are not privacy-preserving. A method based on scene landmark detection (SLD) was recently proposed to address these limitations. It involves training a convolutional neural network (CNN) to detect a few predetermined, salient, scene-specific 3D points or landmarks and computing camera pose from the associated 2D-3D correspondences. Although SLD outperformed existing learning-based approaches, it was notably less accurate than 3D structure-based methods. In this paper, we show that the accuracy gap was due to insufficient model capacity and noisy labels during training. To mitigate the capacity issue, we propose to split the landmarks into subgroups and train a separate network for each subgroup. To generate better training labels, we propose using dense reconstructions to estimate the visibility of scene landmarks. Finally, we present a compact architecture to improve memory efficiency. In terms of accuracy, our approach is on par with state-of-the-art structure-based methods on the INDOOR-6 dataset, but it runs significantly faster and uses less storage. Code and models can be found at https://github.com/microsoft/SceneLandmarkLocalization.
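
The subgroup idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the round-robin partition, the function names `split_into_subgroups` and `merge_detections`, the confidence threshold `min_conf`, and the detection format `(u, v, confidence)` are all assumptions made for illustration. The sketch only shows how landmarks might be divided among several detectors and how their outputs could be pooled into 2D-3D correspondences; the pose computation itself (for example a PnP solver inside a RANSAC loop) is left out.

```python
from typing import Dict, List, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]
Detection = Tuple[float, float, float]  # (u, v, confidence), hypothetical format


def split_into_subgroups(landmark_ids: List[int], num_groups: int) -> List[List[int]]:
    """Partition landmark ids into disjoint subgroups (round-robin, an assumption)."""
    if num_groups < 1:
        raise ValueError("num_groups must be at least 1")
    return [landmark_ids[g::num_groups] for g in range(num_groups)]


def merge_detections(
    per_group: List[Dict[int, Detection]],
    landmarks_3d: Dict[int, Point3D],
    min_conf: float = 0.5,
) -> List[Tuple[Point2D, Point3D]]:
    """Pool detections from all subgroup networks into 2D-3D correspondences.

    Each dict in per_group stands in for the output of one subgroup network:
    landmark id -> (u, v, confidence). Low-confidence detections are dropped.
    """
    correspondences = []
    for detections in per_group:
        for landmark_id, (u, v, conf) in detections.items():
            if conf >= min_conf:
                correspondences.append(((u, v), landmarks_3d[landmark_id]))
    return correspondences


# Example with made-up numbers: 7 landmarks handled by 3 hypothetical networks.
groups = split_into_subgroups(list(range(7)), 3)
landmarks_3d = {i: (float(i), 0.0, 1.0) for i in range(7)}
fake_outputs = [
    {0: (10.0, 20.0, 0.9), 3: (30.0, 40.0, 0.2)},  # landmark 3 is below threshold
    {1: (50.0, 60.0, 0.8)},
    {},                                            # third network detects nothing
]
matches = merge_detections(fake_outputs, landmarks_3d)
```

The pooled `matches` list is the kind of input a pose solver would consume; how the paper actually assigns landmarks to subgroups and filters detections is not stated in the abstract.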


Pages: 975-984

Page count: 10

50 items in total

[21] Reconstruction Network for single-face detection and landmark localization
Bo Yu
Ian Lane
Fang Chen
Optical and Quantum Electronics, 2017, 49
[22] ST-PixLoc: A Scene-Agnostic Network for Enhanced Camera Localization
Wang, Jing
Wang, Yibo
Jin, Yuchu
Guo, Cheng
Fan, Xuhui
IEEE ACCESS, 2024, 12 : 105294 - 105308
[23] ATTENTION-GUIDED CASCADED NETWORKS FOR IMPROVED FACE DETECTION AND LANDMARK LOCALIZATION UNDER LOW-LIGHT CONDITIONS
Oludare, Victor
Kezebou, Landry
Panetta, Karen
Agaian, Sos
MOBILE MULTIMEDIA/IMAGE PROCESSING, SECURITY, AND APPLICATIONS 2020, 2020, 11399
[24] Scene Text Localization and Recognition with Oriented Stroke Detection
Neumann, Lukas
Matas, Jiri
2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2013, : 97 - 104
[26] Text detection and localization in scene images: a broad review
Mahajan, Shilpa
Rani, Rajneesh
ARTIFICIAL INTELLIGENCE REVIEW, 2021, 54 (06) : 4317 - 4377
[27] Explore Faster Localization Learning For Scene Text Detection
Zhao, Yuzhong
Cai, Yuanqiang
Wu, Weijia
Wang, Weiqiang
2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 156 - 161
[28] Improved localization accuracy by LocNet for Faster R-CNN based text detection in natural scene images
Zhong, Zhuoyao
Sun, Lei
Huo, Qiang
PATTERN RECOGNITION, 2019, 96
[29] Lightweight facial landmark detection network based on improved MobileViT
Song, Limei
Hong, Chuanfei
Gao, Tian
Yu, Jiali
SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (04) : 3123 - 3131
