Fully Sparse 3D Occupancy Prediction

Authors
Liu, Haisong [1, 2]
Chen, Yang [1]
Wang, Haiguang [1]
Yang, Zetong [2]
Li, Tianyu [2]
Zeng, Jia [2]
Chen, Li [2]
Li, Hongyang [2]
Wang, Limin [1, 2]
Affiliations
[1] Nanjing University, State Key Laboratory for Novel Software Technology, Nanjing, People's Republic of China
[2] Shanghai AI Laboratory, Shanghai, People's Republic of China
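Funding
National Key R&D Program of China; National Natural Science Foundation of China
Keywords
3D Occupancy Estimation; Semantic Scene Completion; 3D Reconstruction; Autonomous Driving
DOI
10.1007/978-3-031-72698-9_4
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract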
Occupancy prediction plays a pivotal role in autonomous driving. Previous methods typically construct dense 3D volumes, neglecting the inherent sparsity of the scene and incurring high computational costs. To bridge the gap, we introduce a novel fully sparse occupancy network, termed SparseOcc. SparseOcc first reconstructs a sparse 3D representation from visual inputs and then predicts semantic/instance occupancy from that sparse representation using sparse queries. A mask-guided sparse sampling scheme lets the sparse queries interact with 2D features in a fully sparse manner, thereby avoiding costly dense features or global attention. Additionally, we design a ray-based evaluation metric, RayIoU, to address the inconsistent penalty along depth that arises in the traditional voxel-level mIoU criterion. SparseOcc achieves a RayIoU of 34.0 while maintaining a real-time inference speed of 17.3 FPS with 7 history frames as input. With the number of preceding frames increased to 15, SparseOcc further improves to 35.1 RayIoU without bells and whistles. Code is available at https://github.com/MCG-NJU/SparseOcc.
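For readers unfamiliar with the voxel-level mIoU criterion mentioned in the abstract, the following is a minimal sketch of how it is commonly computed, with a toy case showing how sensitive it is to errors along depth. It is not the authors' evaluation code and does not implement RayIoU. The function name `voxel_iou_per_class`, the two-class labelling, and the toy grid (axis 0 taken as depth) are all assumptions made for illustration.

```python
import numpy as np

FREE, WALL = 0, 1  # hypothetical two-class labelling for the toy scene


def voxel_iou_per_class(pred, gt, num_classes):
    """Per-class voxel IoU; classes absent from both grids are skipped."""
    ious = {}
    for c in range(num_classes):
        p, g = pred == c, gt == c
        union = np.logical_or(p, g).sum()
        if union == 0:
            continue
        ious[c] = float(np.logical_and(p, g).sum() / union)
    return ious


def voxel_miou(pred, gt, num_classes):
    """Mean of the per-class IoUs."""
    ious = voxel_iou_per_class(pred, gt, num_classes)
    return float(np.mean(list(ious.values())))


# Toy 8 x 4 x 4 grid; axis 0 is treated as depth (an assumption).
gt = np.full((8, 4, 4), FREE, dtype=np.int64)
gt[4] = WALL                      # one-voxel-thick wall at depth index 4

pred_shifted = np.full_like(gt, FREE)
pred_shifted[5] = WALL            # thin wall, one voxel too deep

pred_thick = np.full_like(gt, FREE)
pred_thick[4:6] = WALL            # thick wall covering depths 4 and 5

wall_iou_shifted = voxel_iou_per_class(pred_shifted, gt, 2)[WALL]
wall_iou_thick = voxel_iou_per_class(pred_thick, gt, 2)[WALL]
miou_shifted = voxel_miou(pred_shifted, gt, 2)
miou_exact = voxel_miou(gt.copy(), gt, 2)
```

In this toy case the thin wall predicted one voxel too deep gets a wall IoU of 0, while the over-thick prediction gets 0.5. This shows that voxel-level scores can reward thickness over a thin, nearly correct surface. Whether this is exactly the inconsistency the authors target with RayIoU is not stated in the abstract.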
Pages: 54-71
Page count: 18