RETHINKING TRAINING OBJECTIVE FOR SELF-SUPERVISED MONOCULAR DEPTH ESTIMATION: SEMANTIC CUES TO RESCUE

Cited by: 0

Authors:

Li, Keyao [1]

Li, Ge [1]

Li, Thomas [2]

Affiliations:

[1] Peking Univ, Sch Elect & Comp Engn, Shenzhen Grad Sch, Shenzhen, Peoples R China

[2] Peking Univ, Adv Inst Informat Technol, Hangzhou, Peoples R China

Source:

2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP) | 2021

Keywords:

self-supervised learning; monocular depth estimation; semantic cues

DOI:

10.1109/ICIP42928.2021.9506744

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Monocular depth estimation has a wide range of applications in modeling 3D scenes. Since it is expensive to collect ground truth labels to supervise training, many works adopt a self-supervised approach. A common practice is to train the network by optimizing a photometric objective (i.e., view synthesis) due to its effectiveness. However, this training objective is sensitive to optical changes and does not consider object-level cues, which leads to sub-optimal results in some cases, e.g., artifacts in complex regions and depth discontinuities around thin structures. We summarize these as depth ambiguities. In this paper, we propose a simple yet effective architecture that introduces semantic cues into supervision to address the problems mentioned above. First, through our study of these problems, we find that they stem from the limitations of the commonly applied photometric reconstruction training objective. We then propose a method that uses semantic cues to encode the geometric constraint behind view synthesis. The proposed objective is more reliable for confusing pixels and also incorporates object-level perception. Experiments show that, without introducing extra inference complexity, our method greatly alleviates depth ambiguities and performs comparably with state-of-the-art methods on the KITTI benchmark.


Pages: 3308-3312

Page count: 5
