S2CNet: Semantic and Structure Completion Network for 3D Object Detection

Cited by: 0
Authors
Shi, Chao [1]
Zhang, Chongyang [1, 2]
Luo, Yan [1]
Qian, Zefeng [1]
Zhao, Muming [3]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Elect Engn, Shanghai 200240, Peoples R China
[2] Shanghai Jiao Tong Univ, AI Inst, MoE Key Lab Artificial Intelligence, Shanghai 200240, Peoples R China
[3] Beijing Forestry Univ, Sch Informat Sci & Technol, Beijing 200240, Peoples R China
Keywords
Feature extraction; Semantics; Proposals; Three-dimensional displays; Point cloud compression; Detectors; Object detection; 3D object detection; point cloud; feature completion; autonomous driving
DOI
10.1109/TITS.2024.3429139
Chinese Library Classification
TU [Building Science]
Discipline classification code
0813
Abstract
LiDAR has become one of the primary sensors for 3D object detection in autonomous driving. However, because point clouds are inherently sparse, objects in occluded and distant areas often appear structurally incomplete, which hampers accurate perception of objects in 3D space. To tackle this challenge, we propose the Semantic and Structure Completion Network (S2CNet) for 3D object detection. Concretely, we design the Semantic Completion (SeC) module to generate semantic features in Bird's-Eye-View (BEV) space using a teacher-student paradigm. Notably, we adopt a coarse-to-fine guidance strategy that encourages the student network to generate semantic features specifically within foreground regions, so that it focuses on foreground object features. In addition, we introduce an attention-based module to adaptively fuse the generated features with the raw features. The SeC module has a particular limitation: when an object contains only a few points, the network is prone to generating low-quality proposals with inaccurate localization. To complement the SeC module, we introduce the Structure Completion (StC) module, in which a group of structural proposals is obtained by traversing most structures in a structure-guided manner, which guarantees at least one proposal whose structure is similar to the ground truth. Extensive experiments on the KITTI and nuScenes benchmarks demonstrate the effectiveness of our method, especially for hard-setting objects with fewer points.
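The two SeC ideas named in the abstract, foreground-restricted teacher-student guidance and adaptive fusion of generated and raw features, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the single-level foreground mask (the coarse-to-fine schedule is not reproduced), the mean-squared-error objective, and the per-cell sigmoid gate are all assumptions chosen for illustration.

```python
import numpy as np


def foreground_masked_mse(student, teacher, fg_mask):
    """Hypothetical guidance loss: MSE between BEV feature maps,
    averaged over foreground cells only.

    student, teacher: arrays of shape (C, H, W)
    fg_mask: array of shape (H, W) with 1 for foreground cells, 0 otherwise
    """
    mask = fg_mask.astype(student.dtype)
    sq_err = ((student - teacher) ** 2).sum(axis=0)  # (H, W), summed over channels
    n_fg = mask.sum()
    if n_fg == 0:
        return 0.0  # no foreground cells, so nothing to supervise
    return float((sq_err * mask).sum() / (n_fg * student.shape[0]))


def gated_fusion(raw, generated, w, b):
    """Hypothetical attention-style fusion with one gate per BEV cell.

    alpha = sigmoid(w . [raw; generated] + b)
    output = alpha * generated + (1 - alpha) * raw

    raw, generated: arrays of shape (C, H, W)
    w: array of shape (2C,), b: scalar (assumed learnable parameters)
    """
    stacked = np.concatenate([raw, generated], axis=0)        # (2C, H, W)
    logits = np.tensordot(w, stacked, axes=([0], [0])) + b    # (H, W)
    alpha = 1.0 / (1.0 + np.exp(-logits))                     # gate in (0, 1)
    return alpha * generated + (1.0 - alpha) * raw            # (C, H, W)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher = rng.normal(size=(4, 8, 8))
    student = rng.normal(size=(4, 8, 8))
    fg = np.zeros((8, 8))
    fg[2:5, 3:6] = 1  # a toy foreground box in BEV
    print("guidance loss:", foreground_masked_mse(student, teacher, fg))
    fused = gated_fusion(student, teacher, w=np.zeros(8), b=0.0)
    print("fused shape:", fused.shape)
```

Restricting the loss to foreground cells means background features do not contribute to the student's objective, which matches the abstract's stated goal of focusing generation on foreground objects.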
Pages: 17134-17146
Page count: 13