Self-Supervised Human Depth Estimation from Monocular Videos

Cited by: 19
Authors
Tan, Feitong [1]
Zhu, Hao [2]
Cui, Zhaopeng [3]
Zhu, Siyu [4]
Pollefeys, Marc [3]
Tan, Ping [1]
Affiliations
[1] Simon Fraser Univ, Burnaby, BC, Canada
[2] Nanjing Univ, Nanjing, Jiangsu, Peoples R China
[3] Swiss Fed Inst Technol, Zurich, Switzerland
[4] Alibaba AI Labs, Hangzhou, Zhejiang, Peoples R China
Keywords
SHAPE;
DOI
10.1109/CVPR42600.2020.00073
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Previous methods for estimating detailed human depth often require supervised training with 'ground truth' depth data. This paper presents a self-supervised method that can be trained on YouTube videos without known depth, which makes training data collection simple and improves the generalization of the learned network. The self-supervised learning is achieved by minimizing a photo-consistency loss, which is evaluated between a video frame and its neighboring frames warped according to the estimated depth and the 3D non-rigid motion of the human body. To recover this non-rigid motion, we first estimate a rough SMPL model at each video frame and compute the non-rigid body motion accordingly, which enables self-supervised learning of the shape details. Experiments demonstrate that our method generalizes better and performs much better on data in the wild.
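The photo-consistency idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a pinhole camera with known intrinsics (`fx`, `fy`, `cx`, `cy`), grayscale frames stored as nested lists, and a hypothetical per-pixel 3D displacement field `motion` standing in for the SMPL-derived non-rigid motion. It uses nearest-neighbor sampling for brevity, whereas an actual training loss would need a differentiable sampler such as bilinear interpolation.

```python
def photo_consistency_loss(frame, neighbor, depth, motion, fx, fy, cx, cy):
    """Mean absolute intensity difference between `frame` and `neighbor`
    sampled at the locations predicted by depth and 3D motion.

    frame, neighbor: H x W nested lists of grayscale intensities.
    depth:  H x W nested list, estimated depth for each pixel of `frame`.
    motion: H x W nested list of (dx, dy, dz) displacements (assumption:
            a stand-in for the body motion derived from SMPL fits).
    fx, fy, cx, cy: pinhole intrinsics (assumed known).
    """
    height, width = len(frame), len(frame[0])
    total, count = 0.0, 0
    for v in range(height):
        for u in range(width):
            z = depth[v][u]
            # Back-project pixel (u, v) to a 3D point in the camera frame.
            x = (u - cx) / fx * z
            y = (v - cy) / fy * z
            # Move the point by its non-rigid 3D motion.
            dx, dy, dz = motion[v][u]
            x2, y2, z2 = x + dx, y + dy, z + dz
            if z2 <= 1e-6:
                continue  # point ends up behind the camera
            # Project into the neighboring frame (nearest pixel).
            u2 = round(fx * x2 / z2 + cx)
            v2 = round(fy * y2 / z2 + cy)
            if 0 <= u2 < width and 0 <= v2 < height:
                total += abs(frame[v][u] - neighbor[v2][u2])
                count += 1
    # Average over pixels that land inside the neighboring frame.
    return total / count if count else 0.0
```

When the depth and motion are correct, corresponding pixels have similar intensities and the loss is small; errors in the estimated depth shift the sampled locations and raise it, which is what provides the training signal without ground-truth depth.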
Pages: 647-656
Number of pages: 10