DeepVideoMVS: Multi-View Stereo on Video with Recurrent Spatio-Temporal Fusion

Cited by: 35
Authors
Duzceker, Arda [1]
Galliani, Silvano [2]
Vogel, Christoph [2]
Speciale, Pablo [2]
Dusmanu, Mihai [1]
Pollefeys, Marc [1, 2]
Affiliations
[1] Swiss Fed Inst Technol, Dept Comp Sci, Zurich, Switzerland
[2] Microsoft Mixed Real & AI Zurich Lab, Zurich, Switzerland
Keywords
DOI
10.1109/CVPR46437.2021.01507
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose an online multi-view depth prediction approach on posed video streams, where the scene geometry information computed in the previous time steps is propagated to the current time step in an efficient and geometrically plausible way. The backbone of our approach is a real-time capable, lightweight encoder-decoder that relies on cost volumes computed from pairs of images. We extend it by placing a ConvLSTM cell at the bottleneck layer, which compresses an arbitrary amount of past information in its states. The novelty lies in propagating the hidden state of the cell by accounting for the viewpoint changes between time steps. At a given time step, we warp the previous hidden state into the current camera plane using the previous depth prediction. Our extension adds only a small overhead in computation time and memory consumption, while improving the depth predictions significantly. As a result, we outperform the existing state-of-the-art multi-view stereo methods on most of the evaluated metrics in hundreds of indoor scenes while maintaining real-time performance. Code is available at https://github.com/ardaduz/deep-video-mvs
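The hidden-state propagation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that); it only shows the general idea of a depth-based backward warp. It assumes a depth estimate expressed in the current view is already available (for example, derived from the previous depth prediction), a pinhole intrinsics matrix at the hidden state's resolution, and nearest-neighbour sampling. The function name `warp_hidden_state` and all parameter names are hypothetical.

```python
import numpy as np


def warp_hidden_state(hidden_prev, depth_cur, K, T_prev_from_cur):
    """Backward-warp a (C, H, W) hidden state from the previous view into the current view.

    hidden_prev:     (C, H, W) feature map defined on the previous camera plane
    depth_cur:       (H, W) depth estimate for the current view (assumed given)
    K:               (3, 3) pinhole intrinsics at the hidden-state resolution
    T_prev_from_cur: (4, 4) rigid transform taking current-camera points
                     into the previous camera frame
    Pixels that fall outside the previous view are filled with zeros.
    """
    C, H, W = hidden_prev.shape

    # Homogeneous pixel coordinates (u, v, 1) of every current-view pixel.
    v, u = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    pix = np.stack([u, v, np.ones_like(u)], axis=0).reshape(3, -1).astype(np.float64)

    # Back-project to 3D in the current camera frame using the depth estimate.
    pts_cur = (np.linalg.inv(K) @ pix) * depth_cur.reshape(1, -1)

    # Move the points into the previous camera frame and project them.
    pts_prev = T_prev_from_cur[:3, :3] @ pts_cur + T_prev_from_cur[:3, 3:4]
    proj = K @ pts_prev
    z = proj[2]

    # Keep only points in front of the previous camera and inside its image.
    valid = z > 1e-6
    z_safe = np.where(valid, z, 1.0)
    u_prev = np.rint(proj[0] / z_safe).astype(int)
    v_prev = np.rint(proj[1] / z_safe).astype(int)
    valid &= (u_prev >= 0) & (u_prev < W) & (v_prev >= 0) & (v_prev < H)

    # Nearest-neighbour lookup of the previous hidden state.
    flat = np.zeros((C, H * W), dtype=hidden_prev.dtype)
    flat[:, valid] = hidden_prev[:, v_prev[valid], u_prev[valid]]
    return flat.reshape(C, H, W)
```

With an identity pose the warp leaves the hidden state unchanged, which is a convenient sanity check. A learned system would more likely use differentiable bilinear sampling instead of the nearest-neighbour lookup used here.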
Pages: 15319-15328
Page count: 10