Self-Supervised Monocular Depth and Motion Learning in Dynamic Scenes: Semantic Prior to Rescue

Cited by: 0
Authors
Seokju Lee
Francois Rameau
Sunghoon Im
In So Kweon
Affiliations
[1] Korea Institute of Energy Technology (KENTECH), School of Energy Engineering
[2] Korea Advanced Institute of Science and Technology (KAIST), School of Electrical Engineering
[3] Daegu Gyeongbuk Institute of Science and Technology (DGIST), Department of Information and Communication Engineering
Source
International Journal of Computer Vision, 2022, 130 (09)
Keywords
3D visual perception; Monocular depth estimation; Motion estimation; Self-supervised learning;
DOI
Not available
Abstract
We introduce an end-to-end joint training framework that explicitly models the 6-DoF motion of multiple dynamic objects, ego-motion, and depth in a monocular camera setup without geometric supervision. Our technical contributions are three-fold. First, we highlight the fundamental difference between inverse and forward projection when modeling the individual motion of each rigid object, and propose a geometrically correct projection pipeline using a neural forward projection module. Second, we propose two types of residual motion learning frameworks to explicitly disentangle camera and object motions in dynamic driving scenes with different levels of semantic prior knowledge: video instance segmentation as a strong prior, and object detection as a weak prior. Third, we design a unified photometric and geometric consistency loss that holistically imposes self-supervisory signals for every background and object region. In addition, we present an unsupervised method of 3D motion field regularization for semantically plausible object motion representation. Our proposed elements are validated in a detailed ablation study. Through extensive experiments conducted on the KITTI, Cityscapes, and Waymo Open Dataset benchmarks, our framework is shown to outperform state-of-the-art depth and motion estimation methods. Our code, dataset, and models are publicly available.
Pages: 2265 - 2285
Page count: 20
Related papers
50 in total
  • [1] Self-Supervised Monocular Depth and Motion Learning in Dynamic Scenes: Semantic Prior to Rescue
    Lee, Seokju
    Rameau, Francois
    Im, Sunghoon
    Kweon, In So
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2022, 130 (09) : 2265 - 2285
  • [2] Self-supervised monocular depth estimation in dynamic scenes based on deep learning
    Cheng, Binbin
    Yu, Ying
    Zhang, Lei
    Wang, Ziquan
    Jiang, Zhipeng
    [J]. National Remote Sensing Bulletin, 2024, 28 (09) : 2170 - 2186
  • [3] Self-supervised monocular depth estimation on water scenes via specular reflection prior
    Lu, Zhengyang
    Chen, Ying
    [J]. DIGITAL SIGNAL PROCESSING, 2024, 149
  • [4] Self-Supervised Multi-Frame Monocular Depth Estimation for Dynamic Scenes
    Wu, Guanghui
    Liu, Hao
    Wang, Longguang
    Li, Kunhong
    Guo, Yulan
    Chen, Zengping
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (06) : 4989 - 5001
  • [5] Self-supervised monocular depth estimation in dynamic scenes with moving instance loss
    Yue, Min
    Fu, Guangyuan
    Wu, Ming
    Zhang, Xin
    Gu, Hongyang
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2022, 112
  • [6] Self-Supervised Learning for Monocular Depth Estimation on Minimally Invasive Surgery Scenes
    Shao, Shuwei
    Pei, Zhongcai
    Chen, Weihai
    Zhang, Baochang
    Wu, Xingming
    Sun, Dianmin
    Doermann, David
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021 : 7159 - 7165
  • [7] RETHINKING TRAINING OBJECTIVE FOR SELF-SUPERVISED MONOCULAR DEPTH ESTIMATION: SEMANTIC CUES TO RESCUE
    Li, Keyao
    Li, Ge
    Li, Thomas
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021 : 3308 - 3312
  • [8] Graph semantic information for self-supervised monocular depth estimation
    Zhang, Dongdong
    Wang, Chunping
    Wang, Huiying
    Fu, Qiang
    [J]. PATTERN RECOGNITION, 2024, 156
  • [9] Self-Supervised monocular depth and ego-Motion estimation in endoscopy: Appearance flow to the rescue
    Shao, Shuwei
    Pei, Zhongcai
    Chen, Weihai
    Zhu, Wentao
    Wu, Xingming
    Sun, Dianmin
    Zhang, Baochang
    [J]. MEDICAL IMAGE ANALYSIS, 2022, 77
  • [10] SC-DepthV3: Robust Self-Supervised Monocular Depth Estimation for Dynamic Scenes
    Sun, Libo
    Bian, Jia-Wang
    Zhan, Huangying
    Yin, Wei
    Reid, Ian
    Shen, Chunhua
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (01) : 497 - 508