OnionNet: Single-View Depth Prediction and Camera Pose Estimation for Unlabeled Video

Cited by: 4
Authors
Gu, Tianhao [1, 2]
Wang, Zhe [1, 2]
Li, Dongdong [2]
Yang, Hai [2]
Du, Wenli [1]
Zhou, Yangming [2]
Affiliations
[1] East China Univ Sci & Technol, Key Lab Adv Control & Optimizat Chem Proc, Minist Educ, Shanghai 200237, Peoples R China
[2] East China Univ Sci & Technol, Dept Comp Sci & Engn, Shanghai 200237, Peoples R China
Funding
National Science Foundation (USA);
Keywords
Cameras; Training; Pose estimation; Geometry; Robustness; Task analysis; Decoding; Camera pose estimation; multitask learning; single-view depth prediction; unsupervised learning; LOCALIZATION; SLAM;
DOI
10.1109/TCDS.2020.3042521
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In real scenes, humans can easily infer their own position and their distance from other objects by sight alone. To give robots the same visual ability, this article presents OnionNet, an unsupervised framework comprising LeafNet and ParachuteNet, for single-view depth prediction and camera pose estimation. To speed up OnionNet's convergence and to concretize objects despite gradient locality and moving objects in videos, LeafNet adopts two decoders and enhanced upconvolution modules. Meanwhile, to improve robustness to fast camera movement and rotation, ParachuteNet integrates three pose networks that estimate multiview camera pose parameters in combination with a modified image preprocessing step. Unlike existing methods, single-view depth prediction and camera pose estimation are trained view by view: the view range is gradually reduced from one view to the next, so the outer pixels disappear in the next view, similar to peeling an onion. Moreover, LeafNet is optimized with the pose parameters from each pose network in turn. Experimental results on the KITTI data set show the effectiveness of the method: single-view depth prediction performs better than most supervised and unsupervised methods that contain the same two subtasks, and pose estimation achieves state-of-the-art performance compared with existing methods under comparable input settings.
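The "onion peeling" relationship between views described in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' preprocessing: the function name `peel_views`, the fixed per-view `margin`, and the use of symmetric center crops are all hypothetical choices made here to show how outer pixels could disappear from one view to the next.

```python
from typing import List

Image = List[List[int]]  # a toy single-channel image as nested lists


def peel_views(image: Image, num_views: int, margin: int) -> List[Image]:
    """Return nested center crops of `image`.

    Assumption (not from the paper): view k drops `margin` more border
    pixels on every side than view k-1, so each view is contained in
    the previous one, like successive layers of an onion.
    """
    height, width = len(image), len(image[0])
    views = []
    for k in range(num_views):
        offset = k * margin
        if height - 2 * offset <= 0 or width - 2 * offset <= 0:
            raise ValueError("margin too large for the requested number of views")
        # Keep only the rows and columns inside the current border.
        views.append([row[offset:width - offset]
                      for row in image[offset:height - offset]])
    return views


# Toy 6x6 image whose pixel value encodes its position (row * 6 + col).
image = [[r * 6 + c for c in range(6)] for r in range(6)]
views = peel_views(image, num_views=3, margin=1)
sizes = [(len(v), len(v[0])) for v in views]
```

Here `sizes` shrinks from the full 6x6 image to a 2x2 core, and every pixel of a later view also appears in the earlier ones; how the actual framework chooses its view ranges is not specified in this record.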
Pages: 995-1009
Number of pages: 15