SOGDet: Semantic-Occupancy Guided Multi-View 3D Object Detection

Cited by: 0
Authors
Zhou, Qiu
Cao, Jinming [1]
Leng, Hanchao [2]
Yin, Yifang [3]
Kun, Yu [2]
Zimmermann, Roger [1]
Affiliations
[1] Natl Univ Singapore, Singapore, Singapore
[2] Xiaomi Car, Singapore, Singapore
[3] ASTAR, I2R, Singapore, Singapore
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In the field of autonomous driving, accurate and comprehensive perception of the 3D environment is crucial. Bird's Eye View (BEV) based methods have emerged as a promising solution for 3D object detection using multi-view images as input. However, existing 3D object detection methods often ignore the physical context of the environment, such as sidewalks and vegetation, resulting in sub-optimal performance. In this paper, we propose a novel approach called SOGDet (Semantic-Occupancy Guided Multi-view 3D Object Detection), which leverages a 3D semantic-occupancy branch to improve the accuracy of 3D object detection. In particular, the physical context modeled by semantic occupancy helps the detector perceive scenes more holistically. SOGDet is flexible to use and can be seamlessly integrated with most existing BEV-based methods. To evaluate its effectiveness, we apply this approach to several state-of-the-art baselines and conduct extensive experiments on the nuScenes dataset. Our results show that SOGDet consistently enhances the performance of three baseline methods in terms of nuScenes Detection Score (NDS) and mean Average Precision (mAP). This indicates that combining 3D object detection with 3D semantic occupancy leads to a more comprehensive perception of the 3D environment, thereby helping to build more robust autonomous driving systems. The code is available at: https://github.com/zhouqiu/SOGDet.
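The abstract reports results as NDS and mAP without saying how the two relate. As a reader aid, the sketch below follows the standard nuScenes definition of NDS: a weighted sum of mAP and five true-positive error terms (translation, scale, orientation, velocity, attribute), each clipped to 1 and converted to a score. The function name and the input values are illustrative assumptions; they are not taken from the SOGDet paper or its code.

```python
def nuscenes_detection_score(mean_ap, tp_errors):
    """Compute NDS from mAP and the five mean true-positive errors.

    NDS = (5 * mAP + sum(1 - min(1, err) for each TP error)) / 10

    `tp_errors` maps the metric names mATE, mASE, mAOE, mAVE, mAAE
    to their (non-negative) error values.
    """
    expected = {"mATE", "mASE", "mAOE", "mAVE", "mAAE"}
    if set(tp_errors) != expected:
        raise ValueError(f"tp_errors must have exactly the keys {sorted(expected)}")
    # Each error is clipped to 1 and turned into a score in [0, 1].
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors.values()]
    # mAP carries weight 5, each of the five TP scores carries weight 1.
    return (5.0 * mean_ap + sum(tp_scores)) / 10.0


# Illustrative numbers only (NOT results from the paper).
example_errors = {"mATE": 0.5, "mASE": 0.5, "mAOE": 0.5, "mAVE": 0.5, "mAAE": 0.5}
example_nds = nuscenes_detection_score(0.5, example_errors)
```

Because mAP takes half of the total weight, a detector can raise NDS either by detecting more objects or by estimating box translation, size, orientation, velocity, and attributes more precisely.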
Pages: 7668-7676
Number of pages: 9