OnionNet: Single-View Depth Prediction and Camera Pose Estimation for Unlabeled Video

Cited by: 4
Authors
Gu, Tianhao [1, 2]
Wang, Zhe [1, 2]
Li, Dongdong [2]
Yang, Hai [2]
Du, Wenli [1]
Zhou, Yangming [2]
Affiliations
[1] East China Univ Sci & Technol, Key Lab Adv Control & Optimizat Chem Proc, Minist Educ, Shanghai 200237, Peoples R China
[2] East China Univ Sci & Technol, Dept Comp Sci & Engn, Shanghai 200237, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Cameras; Training; Pose estimation; Geometry; Robustness; Task analysis; Decoding; Camera pose estimation; multitask learning; single-view depth prediction; unsupervised learning; LOCALIZATION; SLAM;
DOI
10.1109/TCDS.2020.3042521
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In real scenes, humans can easily infer their own position and their distance from other objects using only their eyes. To give robots the same visual ability, this article presents OnionNet, an unsupervised framework comprising LeafNet and ParachuteNet, for single-view depth prediction and camera pose estimation. To speed up OnionNet's convergence and to render objects concretely despite gradient locality and moving objects in videos, LeafNet adopts two decoders and enhanced upconvolution modules. Meanwhile, to improve robustness to fast camera movement and rotation, ParachuteNet integrates three pose networks that estimate multiview camera pose parameters in combination with a modified image preprocessing step. Unlike existing methods, single-view depth prediction and camera pose estimation are trained view by view, where the view range shrinks gradually from one view to the next and the outer pixels disappear in the next view, similar to peeling an onion. Moreover, LeafNet is optimized with the pose parameters from each pose network in turn. Experimental results on the KITTI data set show the effectiveness of our method: single-view depth prediction performs better than most supervised and unsupervised methods that contain the same two subtasks, and pose estimation achieves state-of-the-art performance compared with existing methods under comparable input settings.
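The abstract does not say how the successive views are produced. As a purely illustrative sketch, assuming each view is a center crop of the previous one, the "onion peeling" idea (the view range shrinks and the outer pixels disappear in the next view) might look like the following. The function name `peel_views` and the parameters `num_views` and `margin` are hypothetical and do not come from the paper.

```python
def peel_views(image, num_views, margin):
    """Return `num_views` nested center crops of `image`, outermost first.

    Assumption (not from the paper): each successive view drops a border
    of `margin` pixels on every side of the previous view.
    `image` is a list of rows, each row a list of pixel values.
    """
    height, width = len(image), len(image[0])
    views = []
    for k in range(num_views):
        off = k * margin  # total border removed so far on each side
        if height - 2 * off <= 0 or width - 2 * off <= 0:
            raise ValueError("margin too large for the requested number of views")
        # Keep only the inner region; the outer ring of pixels disappears.
        views.append([row[off:width - off] for row in image[off:height - off]])
    return views


# Toy 6x8 "frame" whose pixel value records its (row, col) position.
frame = [[(r, c) for c in range(8)] for r in range(6)]
views = peel_views(frame, num_views=3, margin=1)
sizes = [(len(v), len(v[0])) for v in views]
```

Here `sizes` holds the height and width of each view, so one can check that every view is strictly contained in the one before it. Whether OnionNet crops, rescales, or otherwise derives its views is not stated in this record.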
Pages: 995-1009
Number of pages: 15