Full Surround Monodepth From Multiple Cameras

Cited by: 13
Authors
Guizilini, Vitor [1]
Vasiljevic, Igor [2]
Ambrus, Rares [1]
Shakhnarovich, Greg [2]
Gaidon, Adrien [1]
Affiliations
[1] Toyota Res Inst TRI, Los Altos, CA 95051 USA
[2] Toyota Technol Inst Chicago, Chicago, IL 60194 USA
Keywords
Computer vision; machine learning; autonomous automobiles
DOI
10.1109/LRA.2022.3150884
Chinese Library Classification (CLC) number
TP24 [Robotics]
Discipline classification code
080202; 1405
Abstract
Self-supervised monocular depth and ego-motion estimation is a promising approach to replace or supplement expensive depth sensors such as LiDAR for robotics applications like autonomous driving. However, most research in this area focuses on a single monocular camera or stereo pairs that cover only a fraction of the scene around the vehicle. In this work, we extend monocular self-supervised depth and ego-motion estimation to large-baseline multi-camera rigs. Using generalized spatio-temporal contexts, pose consistency constraints, and carefully designed photometric loss masking, we learn a single network generating dense, consistent, and scale-aware point clouds that cover the same full-surround 360-degree field of view as a typical LiDAR scanner. We also propose a new scale-consistent evaluation metric more suitable to multi-camera settings. Experiments on two challenging benchmarks illustrate the benefits of our approach over strong baselines.
Pages: 5397-5404
Page count: 8