Multi-view PointNet for 3D Scene Understanding

Cited by: 77
Authors
Jaritz, Maximilian [1]
Gu, Jiayuan [2]
Su, Hao [2]
Affiliations
[1] INRIA, Valeo, Rocquencourt, France
[2] Univ Calif San Diego, San Diego, CA USA
DOI
10.1109/ICCVW.2019.00494
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Fusion of 2D images and 3D point clouds is important because information from dense images can enhance sparse point clouds. However, fusion is challenging because 2D and 3D data live in different spaces. In this work, we propose MVPNet (Multi-View PointNet), where we aggregate 2D multi-view image features into 3D point clouds, and then use a point based network to fuse the features in 3D canonical space to predict 3D semantic labels. To this end, we introduce view selection along with a 2D-3D feature aggregation module. Extensive experiments show the benefit of leveraging features from dense images and reveal superior robustness to varying point cloud density compared to 3D-only methods. On the ScanNetV2 [4] benchmark, our MVPNet significantly outperforms prior point cloud based approaches on the task of 3D Semantic Segmentation. It is much faster to train than the large networks of the sparse voxel approach [6]. We provide solid ablation studies to ease the future design of 2D-3D fusion methods and their extension to other tasks, as we showcase for 3D instance segmentation.
Pages: 3995-4003
Number of pages: 9