SA-MVSNet: Self-attention-based multi-view stereo network for 3D reconstruction of images with weak texture

Cited by: 3
Authors
Yang, Ronghao [1]
Miao, Wang [1]
Zhang, Zhenxin [2, 3]
Liu, Zhenlong [1]
Li, Mubai [2, 3]
Lin, Bin [1]
Affiliations
[1] Chengdu Univ Technol, Coll Earth Sci, Chengdu 610059, Sichuan, Peoples R China
[2] Capital Normal Univ, Key Lab 3D Informat Acquisit & Applicat, MOE, Beijing 100048, Peoples R China
[3] Capital Normal Univ, Coll Resource Environm & Tourism, Beijing 100048, Peoples R China
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China;
Keywords
Multi-view stereo; Depth estimation; Self-attention; Transformer; Weak texture; Adaptive propagation;
DOI
10.1016/j.engappai.2023.107800
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Multi-view stereo (MVS) reconstruction is a key task in image-based 3D reconstruction, and deep learning-based methods can achieve better results than traditional algorithms. However, most current deep learning-based MVS methods extract image features with convolutional neural networks (CNNs), which cannot aggregate long-range context or capture robust global information. In addition, when depth maps are fused into point clouds, confidence filters discard low-confidence depth values in weak-texture areas. Together, these problems lead to low completeness in the 3D reconstruction of weak-texture and texture-less areas. To address them, this paper proposes SA-MVSNet, which builds on PatchmatchNet and adds a self-attention mechanism. First, we design a coarse-to-fine network framework for depth map estimation. In the feature extraction network, a pyramid-structured module based on the Swin Transformer block replaces the original Feature Pyramid Network (FPN), and a global self-attention mechanism strengthens the self-correlation between weak-texture areas. Second, we propose a self-attention-based adaptive propagation module (SA-AP), which applies self-attention within the depth propagation window to obtain the relative weights between the current pixel and the other pixels, and then adaptively samples the depth values of neighbors on the same surface for propagation. Experiments show that SA-MVSNet significantly improves the completeness of 3D reconstruction for weak-texture images on the DTU (provided by the Technical University of Denmark), BlendedMVS, and Tanks and Temples datasets. Our code is available at https://github.com/miaowang525/SA-MVSNet.
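The weighting idea behind the adaptive propagation step can be illustrated with a small sketch. This is not the authors' SA-AP implementation (see their repository for that). It is a minimal, assumption-laden example in plain Python that uses each pixel's raw feature vector as both query and key, with no learned projections, and keeps the depths of the highest-weighted neighbors. The function names, the toy features, and the choice of keeping the top `k` neighbors are all hypothetical.

```python
import math

def attention_weights(center_feat, window_feats):
    """Softmax over scaled dot products between the center pixel's
    feature and every feature in the propagation window."""
    d = len(center_feat)
    scores = [
        sum(c * w for c, w in zip(center_feat, feat)) / math.sqrt(d)
        for feat in window_feats
    ]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def select_depths_for_propagation(center_feat, window_feats, window_depths, k=2):
    """Return the depths of the k neighbors with the largest attention
    weight (an assumed stand-in for 'neighbors on the same surface')."""
    weights = attention_weights(center_feat, window_feats)
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return [window_depths[i] for i in ranked[:k]], weights

# Toy window: two neighbors resemble the center pixel, two do not.
center = [1.0, 0.0]
feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
depths = [2.0, 2.1, 5.0, 7.5]
picked, weights = select_depths_for_propagation(center, feats, depths, k=2)
```

In this toy window the two neighbors whose features resemble the center pixel receive the largest weights, so their depth hypotheses are the ones kept for propagation.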
Pages: 15
Related papers
50 in total
  • [1] HighRes-MVSNet: A Fast Multi-View Stereo Network for Dense 3D Reconstruction From High-Resolution Images
    Weilharter, Rafael
    Fraundorfer, Friedrich
    IEEE ACCESS, 2021, 9 : 11306 - 11315
  • [2] Fast and Accurate 3D Reconstruction of Plants Using MVSNet and Multi-View Images
    Chen, Zhen
    Lv, Hui
    Lou, Lu
    Doonan, John H.
    ADVANCES IN COMPUTATIONAL INTELLIGENCE SYSTEMS, 2022, 1409 : 390 - 399
  • [3] Attention aware cost volume pyramid based multi-view stereo network for 3D reconstruction
    Yu, Anzhu
    Guo, Wenyue
    Liu, Bing
    Chen, Xin
    Wang, Xin
    Cao, Xuefeng
    Jiang, Bingchuan
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2021, 175 : 448 - 460
  • [4] Underwater 3D reconstruction based on multi-view stereo
    Gu, Feifei
    Zhao, Juan
    Xu, Pei
    Huang, Shulan
    Zhang, Gaopeng
    Song, Zhan
    OCEAN OPTICS AND INFORMATION TECHNOLOGY, 2018, 10850
  • [5] Multi-view 3D Reconstruction with Self-attention
    Qian, Qiuting
    2021 14TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER THEORY AND ENGINEERING (ICACTE 2021), 2021 : 20 - 26
  • [6] An attention-based and deep sparse priori cascade multi-view stereo network for 3D reconstruction
    Wang, Yadong
    Ran, Teng
    Liang, Yuan
    Zheng, Guoquan
    COMPUTERS & GRAPHICS-UK, 2023, 116 : 383 - 392
  • [7] 360MVSNet: Deep Multi-view Stereo Network with 360° Images for Indoor Scene Reconstruction
    Chiu, Ching-Ya
    Wu, Yu-Ting
    Shen, I-Chao
    Chuang, Yung-Yu
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023 : 3056 - 3065
  • [8] An Extension of PatchMatch Stereo for 3D Reconstruction from Multi-View Images
    Hiradate, Mutsuki
    Ito, Koichi
    Aoki, Takafumi
    Watanabe, Takafumi
    Unten, Hiroki
    PROCEEDINGS 3RD IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION ACPR 2015, 2015 : 61 - 65
  • [9] Multi-View Stereo 3D Edge Reconstruction
    Bignoli, Andrea
    Romanoni, Andrea
    Matteucci, Matteo
    2018 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2018), 2018 : 867 - 875
  • [10] DAR-MVSNet: a novel dual attention residual network for multi-view stereo
    Li, Tingshuai
    Liang, Hu
    Wen, Changchun
    Qu, Jiacheng
    Zhao, Shengrong
    Zhang, Qingmeng
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (8-9) : 5857 - 5866