MVSNet: Depth Inference for Unstructured Multi-view Stereo

Cited by: 631
Authors
Yao, Yao [1]
Luo, Zixin [1]
Li, Shiwei [1]
Fang, Tian [2]
Quan, Long [1]
Affiliations
[1] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
[2] Shenzhen Zhuke Innovat Technol Altizure, Shenzhen, Peoples R China
Source
Keywords
Multi-view stereo; Depth map; Deep learning; Reconstruction
DOI
10.1007/978-3-030-01237-3_47
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present an end-to-end deep learning architecture for depth map inference from multi-view images. In the network, we first extract deep visual image features, and then build the 3D cost volume upon the reference camera frustum via differentiable homography warping. Next, we apply 3D convolutions to regularize and regress the initial depth map, which is then refined with the reference image to generate the final output. Our framework flexibly adapts to arbitrary N-view inputs using a variance-based cost metric that maps multiple features into one cost feature. The proposed MVSNet is demonstrated on the large-scale indoor DTU dataset. With simple post-processing, our method not only significantly outperforms previous state-of-the-art methods, but is also several times faster in runtime. We also evaluate MVSNet on the complex outdoor Tanks and Temples dataset, where our method ranked first before April 18, 2018 without any fine-tuning, showing the strong generalization ability of MVSNet.
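The variance-based cost metric mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `variance_cost`, the array layout (views, depth, height, width, channels), and the random test data are assumptions made for illustration. It assumes the per-view features have already been warped into the reference camera frustum, and only shows why a variance reduces any number of views N to a single cost volume of fixed shape.

```python
import numpy as np

def variance_cost(feature_volumes):
    """Collapse N per-view feature volumes into one cost volume.

    feature_volumes: array of shape (N, D, H, W, C), assumed to be
    already warped into the reference view (hypothetical layout).
    Returns the element-wise variance across the N views, shape (D, H, W, C).
    """
    volumes = np.asarray(feature_volumes, dtype=np.float64)
    if volumes.ndim < 2 or volumes.shape[0] < 2:
        raise ValueError("need at least two views")
    mean = volumes.mean(axis=0)
    # Population variance over the view axis: mean of squared deviations.
    return ((volumes - mean) ** 2).mean(axis=0)

# Synthetic data only: 3 views and 5 views of the same volume size.
rng = np.random.default_rng(0)
three_views = rng.normal(size=(3, 4, 2, 2, 8))
five_views = rng.normal(size=(5, 4, 2, 2, 8))

cost3 = variance_cost(three_views)
cost5 = variance_cost(five_views)

# Identical features in every view give zero cost everywhere.
identical = np.repeat(three_views[:1], 3, axis=0)
cost_identical = variance_cost(identical)
```

The output shape does not depend on N, so the later stages of a network built this way see the same input size for any number of views. Where the views agree the cost is low, which is the intuition behind using variance as a matching measure.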
Pages: 785-801
Page count: 17