3D Object Detection with Fusion Point Attention Mechanism in LiDAR Point Cloud

Cited by: 1
Authors
Liu Weili [1 ,2 ]
Zhu Deli [1 ,2 ]
Luo Huahao [1 ,2 ]
Li Yi [3 ]
Affiliations
[1] Chongqing Normal Univ, Sch Comp & Informat Sci, Chongqing 401331, Peoples R China
[2] Chongqing Digital Agr Serv Engn Technol Res Ctr, Chongqing 401331, Peoples R China
[3] Chongqing Acad Anim Husb, Informat Ctr, Chongqing 401331, Peoples R China
Keywords
3D object detection; Point cloud; Attention mechanism; PointPillars; Cross stage partial network;
DOI
10.3788/gzxb20235209.0912002
CLC number
O43 [Optics];
Discipline codes
070207 ; 0803 ;
Abstract
With the rapid development of computer vision, object detection has achieved remarkable results in 2D vision tasks, but it still cannot cope with problems that arise in real scenes, such as illumination changes and the lack of depth information. The 3D data acquired by LiDAR compensates for some of these shortcomings, so 3D object detection is widely studied as an important part of 3D scene perception. In autonomous driving, 3D object detection is a key component of intelligent transportation, and algorithms based on LiDAR point clouds provide an important means of perception, which in turn ensures that driving is both intelligent and safe. 3D object detection refers to detecting physical objects from sensor data and predicting the category, bounding box, and spatial position of each target. However, because point clouds are unstructured and of non-fixed size, they cannot be processed directly by 3D object detectors and must first be encoded into a more compact representation. Two main families of representations exist: point-based and voxel-based methods. Voxel-based methods are more efficient, but their detection accuracy is lower than that of methods operating on raw point clouds; improving the accuracy of voxel-based methods while preserving their efficiency has therefore become a research hotspot in recent years. To address the loss of fine-grained information and the limited feature-extraction ability of 3D object detectors operating on Pillar-encoded point clouds, this paper proposes a 3D object detection algorithm based on PointPillars that integrates a point-wise spatial attention mechanism and CSPNet.
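The pillar-style encoding mentioned above groups raw LiDAR points into vertical columns on a bird's-eye-view grid before any learned feature extraction. A minimal sketch of that grouping step follows; the grid extents, pillar size, and per-pillar point cap are common KITTI-style values assumed for illustration, not values taken from this paper:

```python
import numpy as np

def pillarize(points, x_range=(0.0, 69.12), y_range=(-39.68, 39.68),
              pillar_size=0.16, max_points=32):
    """Group a LiDAR cloud (N, 4: x, y, z, intensity) into vertical
    pillars on a 2D bird's-eye-view grid, PointPillars-style.
    Returns a sparse dict mapping (ix, iy) -> list of point rows."""
    xs = np.floor((points[:, 0] - x_range[0]) / pillar_size).astype(int)
    ys = np.floor((points[:, 1] - y_range[0]) / pillar_size).astype(int)
    nx = int((x_range[1] - x_range[0]) / pillar_size)  # grid width
    ny = int((y_range[1] - y_range[0]) / pillar_size)  # grid height
    pillars = {}
    for p, ix, iy in zip(points, xs, ys):
        if 0 <= ix < nx and 0 <= iy < ny:       # drop out-of-range points
            bucket = pillars.setdefault((ix, iy), [])
            if len(bucket) < max_points:        # cap points per pillar
                bucket.append(p)
    return pillars
```

In PointPillars the points of each pillar are then augmented with offsets to the pillar center and passed through a small PointNet to form the pseudo-image; this sketch stops at the grouping stage.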
Firstly, the point-wise spatial attention mechanism is integrated into the pillar feature network layer, enhancing the network's ability to extract local geometric information and to retain deep-level information, so that the resulting key features are better suited to the detection task. Point-wise spatial attention follows the basic structure of self-attention: it suppresses the influence of redundant or noisy points on the features, strengthens the description of regions sparsely covered by the point cloud, and thereby alleviates, to a certain extent, the information loss inherent in Pillar-based encoding. Secondly, the ordinary convolutions in the downsampling module, which extracts high-dimensional features from the pseudo-image of the point cloud, are replaced with CSPNet; this splits the gradient flow, further strengthens the network's learning ability while reducing computational cost, and improves detection accuracy. Finally, on the KITTI car class in highway scenarios, the proposed algorithm improves 3D detection accuracy over the baseline network by 2.23%, 2.25%, and 2.30% in the easy, moderate, and hard cases, respectively. The experimental results show that the algorithm significantly improves detection performance while maintaining real-time detection speed, which is of practical value for the optimization and improvement of autonomous driving technology and shows great potential in highway scenarios.
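The point-wise spatial attention described above follows the standard scaled dot-product self-attention pattern, applied across the points of a pillar. A minimal sketch with identity Q/K/V projections follows; a real layer would use learned linear projections, and the shapes here are illustrative rather than the paper's exact design:

```python
import numpy as np

def pointwise_attention(feats):
    """Scaled dot-product self-attention over the points of one pillar.
    feats: (P, C) per-point features. Q/K/V use identity projections
    purely for illustration; a trained layer would learn them."""
    q, k, v = feats, feats, feats                  # identity projections
    scale = np.sqrt(feats.shape[1])
    scores = q @ k.T / scale                       # (P, P) pairwise affinities
    scores -= scores.max(axis=1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over points
    return weights @ v + feats                     # attended features + residual
```

With attention of this form, each point's output is a weighted mixture of all points in the pillar, so sparsely covered regions can borrow evidence from their neighbours, which is the effect the abstract attributes to the mechanism.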
Pages: 11
Related papers
31 in total
  • [1] Multi-View 3D Object Detection Network for Autonomous Driving
    Chen, Xiaozhi
    Ma, Huimin
    Wan, Ji
    Li, Bo
    Xia, Tian
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 6526-6534
  • [2] Chenhang He, 2020, Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), P11870, DOI 10.1109/CVPR42600.2020.01189
  • [3] Point attention network for semantic segmentation of 3D point clouds
    Feng, Mingtao
    Zhang, Liang
    Lin, Xuefei
    Gilani, Syed Zulqarnain
    Mian, Ajmal
    [J]. PATTERN RECOGNITION, 2020, 107
  • [4] Dual Attention Network for Scene Segmentation
    Fu, Jun
    Liu, Jing
    Tian, Haijie
    Li, Yong
    Bao, Yongjun
    Fang, Zhiwei
    Lu, Hanqing
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 3141-3149
  • [5] Vision meets robotics: The KITTI dataset
    Geiger, A.
    Lenz, P.
    Stiller, C.
    Urtasun, R.
    [J]. INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2013, 32(11): 1231-1237
  • [6] HAN Zhuang, 2023, Journal of Qiqihar University, V39, P25
  • [7] HAN Zhuang, 2023, Journal of Qiqihar University, V39, P43
  • [8] PointPillars: Fast Encoders for Object Detection from Point Clouds
    Lang, Alex H.
    Vora, Sourabh
    Caesar, Holger
    Zhou, Lubing
    Yang, Jiong
    Beijbom, Oscar
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 12689-12697
  • [9] Li Qiao, 2022, Journal of Xi'an Jiaotong University, V56, P112
  • [10] 3D object detection in voxelized point cloud scene
    Li Rui-long
    Wu Chuan
    Zhu Ming
    [J]. CHINESE JOURNAL OF LIQUID CRYSTALS AND DISPLAYS, 2022, 37(10): 1355-1363