Instance-Aware Monocular 3D Semantic Scene Completion

Authors
Xiao, Haihong [1]
Xu, Hongbin [1]
Kang, Wenxiong [1]
Li, Yuqiong [2]
Affiliations
[1] South China Univ Technol, Sch Automat Sci & Engn, Guangzhou 511442, Peoples R China
[2] Chinese Acad Sci, Inst Mech, Key Lab Mech Fluid Solid Coupling Syst, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
3D scene understanding; semantic scene completion; 3D vision;
DOI
10.1109/TITS.2023.3344806
Chinese Library Classification (CLC) number
TU [Architectural Science];
Discipline classification code
0813;
Abstract
We study outdoor 3D scene understanding, a challenging task that requires an intelligent system to infer both geometry and semantics from a single-view image, a critical skill for autonomous vehicles navigating the real 3D world. To this end, we present an instance-aware monocular semantic scene completion framework. To the best of our knowledge, this is the first work to specifically target instance perception in camera-based semantic scene completion. Our method consists of two stages. In stage I, we design a region-based VQ-VAE network for 3D occupancy prediction. In stage II, we first introduce an instance-aware attention module that explicitly incorporates instance-level cues captured from mask images to enhance the instance features in RGB images. We then use deformable cross-attention to aggregate the image features corresponding to each voxel query and deformable self-attention to refine the query proposals. We evaluate our method on two challenging datasets, SemanticKITTI and SSCBench-KITTI-360. The results show that it outperforms the state-of-the-art VoxFormer-S: it surpasses VoxFormer-S by 0.22 IoU and 0.72 mIoU on the SemanticKITTI validation set and by 3.04 IoU and 1.06 mIoU on the SSCBench-KITTI-360 validation set. Our approach also perceives critical instances accurately, which indicates its potential for practical deployment.
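The abstract reports two metrics, IoU and mIoU. As a rough illustration of how they are commonly computed for semantic scene completion, the sketch below treats IoU as a class-agnostic occupancy score (occupied versus empty) and mIoU as the mean of per-class IoU over the non-empty semantic classes. It is not taken from the paper: the function name `ssc_metrics`, the empty label `0`, the ignore label `255`, and the choice to skip classes that appear in neither prediction nor ground truth are all assumptions made for this example.

```python
from typing import Sequence, Tuple

EMPTY = 0     # assumed label for free space (hypothetical convention)
IGNORE = 255  # assumed label for voxels excluded from evaluation


def ssc_metrics(pred: Sequence[int], target: Sequence[int],
                num_classes: int) -> Tuple[float, float]:
    """Return (completion IoU, semantic mIoU) for flattened voxel labels."""
    occ_tp = occ_fp = occ_fn = 0
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes

    for p, t in zip(pred, target):
        if t == IGNORE:
            continue
        # Class-agnostic occupancy counts for the completion IoU.
        p_occ, t_occ = p != EMPTY, t != EMPTY
        if p_occ and t_occ:
            occ_tp += 1
        elif p_occ:
            occ_fp += 1
        elif t_occ:
            occ_fn += 1
        # Per-class counts for the semantic mIoU.
        if p == t:
            tp[p] += 1
        else:
            fp[p] += 1
            fn[t] += 1

    occ_denom = occ_tp + occ_fp + occ_fn
    iou = occ_tp / occ_denom if occ_denom else 0.0

    # Average over semantic classes only (the empty class is excluded);
    # classes absent from both prediction and ground truth are skipped.
    per_class = []
    for c in range(num_classes):
        if c == EMPTY:
            continue
        denom = tp[c] + fp[c] + fn[c]
        if denom:
            per_class.append(tp[c] / denom)
    miou = sum(per_class) / len(per_class) if per_class else 0.0
    return iou, miou


# Toy example: six voxels, three labels (0 = empty, 1 and 2 = semantic classes).
iou, miou = ssc_metrics([0, 1, 1, 2, 0, 2], [0, 1, 2, 2, 1, 255], num_classes=3)
print(f"IoU={iou:.4f}, mIoU={miou:.4f}")
```

In the toy example, three of the four voxels that are occupied in either the prediction or the ground truth are occupied in both, so the completion IoU is 0.75. The two semantic classes score 1/3 and 1/2, so the mIoU is 5/12. Benchmark toolkits may handle absent classes and ignored voxels differently, so reported numbers should come from the official evaluation code.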
Pages: 6543-6554
Page count: 12