TransMVSNet: Global Context-aware Multi-view Stereo Network with Transformers

Cited by: 70
Authors
Ding, Yikang [1, 2]
Yuan, Wentao [1, 3]
Zhu, Qingtian [1]
Zhang, Haotian [1]
Liu, Xiangyue [1]
Wang, Yuanjiang [1]
Liu, Xiao [1]
Affiliations
[1] Megvii Research, Beijing, People's Republic of China
[2] Tsinghua University, Beijing, People's Republic of China
[3] Peking University, Beijing, People's Republic of China
DOI
10.1109/CVPR52688.2022.00839
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present TransMVSNet, based on our exploration of feature matching in multi-view stereo (MVS). We frame MVS as the feature matching task it fundamentally is, and therefore propose a powerful Feature Matching Transformer (FMT) that leverages intra- (self-) and inter- (cross-) attention to aggregate long-range context information within and across images. To help the FMT adapt better, we use an Adaptive Receptive Field (ARF) module to ensure a smooth transition in the scope of features, and bridge different stages with a feature pathway that passes transformed features and gradients across scales. In addition, we apply pair-wise feature correlation to measure similarity between features, and adopt an ambiguity-reducing focal loss to strengthen the supervision. To the best of our knowledge, TransMVSNet is the first attempt to apply Transformers to the MVS task. Our method achieves state-of-the-art performance on the DTU dataset, the Tanks and Temples benchmark, and the BlendedMVS dataset. Code is available at https://github.com/MegviiRobot/TransMVSNet.
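The pair-wise feature correlation and focal loss mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the correlation is a plain dot product over channels between one reference-pixel feature and the source feature sampled at each depth hypothesis, and that the focal loss takes the common form -(1 - p)^gamma * log(p) on the probability assigned to the ground-truth depth bin. All function names, the toy feature values, and the default gamma are hypothetical.

```python
import math

def pairwise_correlation(ref_feat, src_feats_per_depth):
    """Dot product between one reference feature vector and the
    source feature sampled at each depth hypothesis (assumed form)."""
    return [sum(r * s for r, s in zip(ref_feat, src))
            for src in src_feats_per_depth]

def softmax(scores):
    """Turn correlation scores into a probability over depth hypotheses."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(probs, gt_index, gamma=2.0):
    """Focal loss on the ground-truth depth bin: -(1 - p)^gamma * log(p).
    gamma=2.0 is an illustrative default, not a value from the paper."""
    p = max(probs[gt_index], 1e-12)  # clamp to avoid log(0)
    return -((1.0 - p) ** gamma) * math.log(p)

# Toy example: a 2-channel feature and three depth hypotheses (made-up values).
ref = [1.0, 0.0]
src_per_depth = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]

scores = pairwise_correlation(ref, src_per_depth)
probs = softmax(scores)
loss_at_best = focal_loss(probs, gt_index=1)
loss_at_worst = focal_loss(probs, gt_index=0)
```

In this toy setup the second hypothesis matches the reference feature best, so the loss is smaller when it is the ground-truth bin than when a poorly matching hypothesis is. The (1 - p)^gamma factor down-weights pixels that are already predicted confidently.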
Pages: 8575-8584
Page count: 10