From Front to Rear: 3D Semantic Scene Completion Through Planar Convolution and Attention-Based Network

Cited by: 2
Authors
Li, Jie [1 ,2 ]
Song, Qi [1 ]
Yan, Xiaohu [2 ]
Chen, Yongquan [3 ]
Huang, Rui [1 ]
Affiliations
[1] Chinese Univ Hong Kong, Sch Sci & Engn, Shenzhen 518172, Guangdong, Peoples R China
[2] Shenzhen Polytech, Sch Artificial Intelligence, Shenzhen 518055, Guangdong, Peoples R China
[3] Chinese Univ Hong Kong, Shenzhen Inst Artificial Intelligence & Robot Soc, Shenzhen 518172, Guangdong, Peoples R China
Keywords
Semantics; Three-dimensional displays; Convolution; Feature extraction; Semantic segmentation; Task analysis; Surface treatment; Semantic scene completion; planar convolution; planar attention; context perception; RGB-D fusion; FUSION NETWORK;
DOI
10.1109/TMM.2023.3234441
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Semantic Scene Completion (SSC) aims to reconstruct complete 3D scenes with precise voxel-wise semantics from the single-view incomplete input data, a crucial but highly challenging problem for scene understanding. Although SSC has seen significant progress due to the introduction of 2D semantic priors in recent years, the occluded parts, especially the rear-view of the scenes, are still poorly completed and segmented. To ameliorate this issue, we propose a novel deep learning framework for 3D SSC, named Planar Convolution and Attention-based Network (PCANet), to effectively extend high-precision predictions of the front-view surface to the rear-view occluded areas. Specifically, we decompose the traditional convolutional layer into three successive planar convolutions to form a Planar Convolution Residual (PCR) block, which maintains the planar features of the 3D scene. Afterward, the Planar Attention Module (PAM) is proposed to capture three different planar attentions and harvest the global context from the front surface to the rear occluded areas to improve the overall accuracy. Extensive experiments on the real NYU and NYUCAD datasets and the synthetic SUNCG-RGBD dataset demonstrate that our proposed framework can generate high-quality SSC results in both front and rear views and outperforms the state-of-the-art approaches trained in an end-to-end manner without additional data.
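The decomposition mentioned in the abstract, a 3D convolution split into three successive planar convolutions wrapped in a residual connection, can be illustrated with a minimal pure-Python sketch. This is not the authors' PCR block: the function names `planar_conv` and `planar_residual_block`, the 3x3 kernel size, the zero padding, the single-channel volume, and the omission of normalization and activation layers are all assumptions made for illustration.

```python
# Illustrative sketch only; names, kernel size, and layout are assumptions,
# not details taken from the paper.

def planar_conv(volume, kernel, axis):
    """Apply a 3x3 kernel inside the planes orthogonal to `axis` (zero padding).

    `volume` is a nested list indexed as volume[x][y][z]. The kernel has
    extent 1 along `axis`, so voxels in different planes never mix.
    """
    dims = (len(volume), len(volume[0]), len(volume[0][0]))
    plane_axes = [a for a in range(3) if a != axis]  # the two in-plane axes
    out = [[[0.0] * dims[2] for _ in range(dims[1])] for _ in range(dims[0])]
    for x in range(dims[0]):
        for y in range(dims[1]):
            for z in range(dims[2]):
                acc = 0.0
                for i in range(3):
                    for j in range(3):
                        pos = [x, y, z]
                        pos[plane_axes[0]] += i - 1
                        pos[plane_axes[1]] += j - 1
                        # Out-of-range neighbours count as zero.
                        if all(0 <= pos[a] < dims[a] for a in range(3)):
                            acc += kernel[i][j] * volume[pos[0]][pos[1]][pos[2]]
                out[x][y][z] = acc
    return out


def planar_residual_block(volume, kernels):
    """Chain three planar convolutions (one per axis), then add the input back."""
    features = volume
    for axis, kernel in enumerate(kernels):
        features = planar_conv(features, kernel, axis)
    return [[[v + f for v, f in zip(v_row, f_row)]
             for v_row, f_row in zip(v_plane, f_plane)]
            for v_plane, f_plane in zip(volume, features)]
```

Each call to `planar_conv` mixes information only within one family of planes, so after the three successive calls every voxel has received context along all three axes. In a trained network the three kernels would be learned and would operate on multi-channel feature maps.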
Pages: 8294-8307
Page count: 14
Related papers
50 in total
  • [1] Semantic Point Completion Network for 3D Semantic Scene Completion
    Zhong, Min
    Zeng, Gang
    [J]. ECAI 2020: 24TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, 325 : 2824 - 2831
  • [2] Attention-Based Multi-Modal Fusion Network for Semantic Scene Completion
    Li, Siqi
    Zou, Changqing
    Li, Yipeng
    Zhao, Xibin
    Gao, Yue
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 11402 - 11409
  • [3] RGBD Based Dimensional Decomposition Residual Network for 3D Semantic Scene Completion
    Li, Jie
    Liu, Yu
    Gong, Dong
    Shi, Qinfeng
    Yuan, Xia
    Zhao, Chunxia
    Reid, Ian
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 7685 - 7694
  • [4] 3D Semantic Scene Completion: A Survey
    Luis Roldão
    Raoul de Charette
    Anne Verroust-Blondet
    [J]. International Journal of Computer Vision, 2022, 130 : 1978 - 2005
  • [5] 3D Semantic Scene Completion: A Survey
    Roldao, Luis
    de Charette, Raoul
    Verroust-Blondet, Anne
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2022, 130 (08) : 1978 - 2005
  • [6] MonoScene: Monocular 3D Semantic Scene Completion
    Anh-Quan Cao
    de Charette, Raoul
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 3981 - 3991
  • [7] Two Stream 3D Semantic Scene Completion
    Garbade, Martin
    Chen, Yueh-Tung
    Sawatzky, Johann
    Gall, Juergen
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2019), 2019, : 416 - 425
  • [8] Object-Aware Semantic Scene Completion Through Attention-Based Feature Fusion and Voxel-Points Representation
    Miao, Yubin
    Wan, Junkang
    Luo, Junjie
    Wu, Hang
    Fu, Ruochong
    [J]. IEEE ACCESS, 2024, 12 : 31431 - 31442
  • [9] Anisotropic Convolutional Networks for 3D Semantic Scene Completion
    Li, Jie
    Han, Kai
    Wang, Peng
    Liu, Yu
    Yuan, Xia
    [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 3348 - 3356
  • [10] Resolution-switchable 3D Semantic Scene Completion
    Luo, Shoutong
    Sun, Zhengxing
    Sun, Yunhan
    Wang, Yi
    [J]. COMPUTER GRAPHICS FORUM, 2022, 41 (07) : 121 - 130