MSPV3D: Multi-Scale Point-Voxels 3D Object Detection Net

Cited by: 0
Authors
Zhang, Zheng [1]
Bao, Zhiping [1]
Wei, Yun [2]
Zhou, Yongsheng [3]
Li, Ming [2]
Tian, Qing [1]
Affiliations
[1] North China Univ Technol, Sch Informat, Beijing 100144, Peoples R China
[2] Beijing Mass Transit Railway Operat Co Ltd, Corp Informat, Beijing 100044, Peoples R China
[3] Beijing Univ Chem Technol, Sch Informat, Beijing 100029, Peoples R China
Keywords
target detection; target recognition; deep learning
DOI
10.3390/rs16173146
Chinese Library Classification
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Autonomous vehicle technology is advancing, with 3D object detection based on point clouds being crucial. However, point clouds' irregularity, sparsity, and large data volume, coupled with irrelevant background points, hinder detection accuracy. We propose a two-stage multi-scale 3D object detection network. Firstly, considering that a large number of useless background points are usually generated by the ground during detection, we propose a new ground filtering algorithm to increase the proportion of foreground points and enhance the accuracy and efficiency of the two-stage detection. Secondly, given that different types of targets to be detected vary in size, and the use of a single-scale voxelization may result in excessive loss of detailed information, the voxels of different scales are introduced to extract relevant features of objects of different scales in the point clouds and integrate them into the second-stage detection. Lastly, a multi-scale feature fusion module is proposed, which simultaneously enhances and integrates features extracted from voxels of different scales. This module fully utilizes the valuable information present in the point cloud across various scales, ultimately leading to more precise 3D object detection. The experiment is conducted on the KITTI dataset and the nuScenes dataset. Compared with our baseline, "Pedestrian" detection improved by 3.37-2.72% and "Cyclist" detection by 3.79-1.32% across difficulty levels on KITTI, and was boosted by 2.4% in NDS and 3.6% in mAP on nuScenes.
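The multi-scale voxelization idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `voxelize` and `multi_scale_voxelize`, the voxel sizes, and the use of a per-voxel centroid as a stand-in for learned voxel features are all assumptions made for illustration only.

```python
from collections import defaultdict
from math import floor


def voxelize(points, voxel_size):
    """Group 3D points into cubic voxels with edge length voxel_size.

    Returns a dict mapping an integer voxel index (i, j, k) to the
    centroid of the points inside that voxel. The centroid is only a
    placeholder for the learned features a detection network would use.
    """
    buckets = defaultdict(list)
    for x, y, z in points:
        key = (floor(x / voxel_size),
               floor(y / voxel_size),
               floor(z / voxel_size))
        buckets[key].append((x, y, z))

    centroids = {}
    for key, pts in buckets.items():
        n = len(pts)
        # zip(*pts) regroups the points into x, y, and z coordinate columns.
        centroids[key] = tuple(sum(axis) / n for axis in zip(*pts))
    return centroids


def multi_scale_voxelize(points, voxel_sizes=(0.5, 1.0, 2.0)):
    """Voxelize the same point cloud at several (hypothetical) scales."""
    return {size: voxelize(points, size) for size in voxel_sizes}


# Toy point cloud: two nearby points, one slightly apart, one far away.
points = [(0.1, 0.2, 0.0), (0.4, 0.3, 0.1), (1.2, 0.1, 0.0), (3.5, 3.5, 0.2)]
grids = multi_scale_voxelize(points)
for size, grid in grids.items():
    print(size, len(grid))
```

In this toy example, the coarsest grid merges the three nearby points into one voxel while the finest grid keeps the third point separate, which is the kind of detail a single voxel scale would lose.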
Pages: 15
Related papers
50 in total
  • [21] Enhanced frustrum multi-scale VoteNet for 3D object detection in cluttered indoor scene
    Zhang, Xuesong
    He, Yu
    Song, Cunli
    Zhuang, Yan
    APPLIED INTELLIGENCE, 2025, 55 (07)
  • [22] 3D Object Detection Algorithm for Panoramic Images With Multi-Scale Convolutional Neural Network
    Wang, Dianwei
    He, Yanhui
    Liu, Ying
    Li, Daxiang
    Wu, Shiqian
    Qin, Yongrui
    Xu, Zhijie
    IEEE ACCESS, 2019, 7 : 171461 - 171470
  • [23] MS23D: A 3D object detection method using multi-scale semantic feature points to construct 3D feature layer
    Shao, Yongxin
    Tan, Aihong
    Yan, Tianhong
    Sun, Zhetao
    Liu, Jiaxin
    NEURAL NETWORKS, 2024, 179
  • [24] Enhanced frustrum multi-scale VoteNet for 3D object detection in cluttered indoor scene
    Xuesong Zhang
    Yu He
    Cunli Song
    Yan Zhuang
    Applied Intelligence, 2025, 55 (7)
  • [25] A Multi-scale Network for Semantic Segmentation of 3D Point Clouds
    He, Ying
    Xiao, Li
    Jiang, Yong
    Sun, Zhigang
    Wang, Zhuo
    Peng, Gang
    2020 CHINESE AUTOMATION CONGRESS (CAC 2020), 2020, : 4113 - 4118
  • [26] 3DMAX-Net: A Multi-Scale Spatial Contextual Network for 3D Point Cloud Semantic Segmentation
    Ma, Yanxin
    Guo, Yulan
    Lei, Yinjie
    Lu, Min
    Zhang, Jun
    2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 1560 - 1566
  • [27] Multi-scale point pair normal encoding for local feature description and 3D object recognition
    Zhang, Chu'ai
    Wang, Yating
    Wu, Qiao
    Zheng, Jiangbin
    Yang, Jiaqi
    Quan, Siwen
    Zhang, Yanning
    JOURNAL OF ELECTRONIC IMAGING, 2024, 33 (04)
  • [28] BSM-NET: multi-bandwidth, multi-scale and multi-modal fusion network for 3D object detection of 4D radar and LiDAR
    Jiang, Tiezhen
    Kang, Runjie
    Li, Qingzhu
    MEASUREMENT SCIENCE AND TECHNOLOGY, 2025, 36 (03)
  • [29] Voxel-FPN: Multi-Scale Voxel Feature Aggregation for 3D Object Detection from LIDAR Point Clouds
    Kuang, Hongwu
    Wang, Bei
    An, Jianping
    Zhang, Ming
    Zhang, Zehan
    SENSORS, 2020, 20 (03)
  • [30] 3D terrestrial LIDAR classifications with super-voxels and multi-scale Conditional Random Fields
    Lim, Ee Hui
    Suter, David
    COMPUTER-AIDED DESIGN, 2009, 41 (10) : 701 - 710