3D Object Detection with Fusion Point Attention Mechanism in LiDAR Point Cloud

Cited by: 1
Authors
Liu Weili [1 ,2 ]
Zhu Deli [1 ,2 ]
Luo Huahao [1 ,2 ]
Li Yi [3 ]
Affiliations
[1] Chongqing Normal Univ, Sch Comp & Informat Sci, Chongqing 401331, Peoples R China
[2] Chongqing Digital Agr Serv Engn Technol Res Ctr, Chongqing 401331, Peoples R China
[3] Chongqing Acad Anim Husb, Informat Ctr, Chongqing 401331, Peoples R China
Keywords
3D object detection; Point cloud; Attention mechanism; PointPillars; Cross stage partial network;
DOI
10.3788/gzxb20235209.0912002
CLC number
O43 [Optics];
Discipline codes
070207 ; 0803 ;
Abstract
With the rapid development of computer vision, object detection has achieved remarkable results in 2D vision tasks, but it still cannot cope with problems that arise in real scenes, such as illumination changes and the lack of depth information. The 3D data acquired by LiDAR compensates for some of these shortcomings, so 3D object detection is widely studied as an important part of 3D scene perception. In autonomous driving, 3D object detection is a key component of intelligent transportation, and algorithms based on LiDAR point clouds provide an important means of perception, which in turn ensures that driving is both intelligent and safe. 3D object detection refers to detecting physical objects from sensor data and predicting the category, bounding box, and spatial position of each target. However, because point clouds are unstructured and of non-fixed size, they cannot be processed directly by 3D object detectors and must first be encoded into a more compact representation. Two main families of representations exist: point-based and voxel-based methods. Voxel-based methods are more efficient, but their detection accuracy is lower than that of methods operating on raw point clouds; improving the accuracy of voxel-based methods while preserving their efficiency has therefore become a research hotspot in recent years. To address the loss of fine-grained information and the limited feature-extraction ability of 3D object detectors operating on Pillar-encoded point clouds, this paper proposes a 3D object detection algorithm based on PointPillars that integrates a point-wise spatial attention mechanism and CSPNet.
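The pillar-style encoding mentioned above groups raw LiDAR points into vertical columns on a bird's-eye-view grid before any learned feature extraction. A minimal sketch of that grouping step follows; the grid extents, pillar size, and per-pillar point cap are common KITTI-style values assumed for illustration, not values taken from this paper:

```python
import numpy as np

def pillarize(points, x_range=(0.0, 69.12), y_range=(-39.68, 39.68),
              pillar_size=0.16, max_points=32):
    """Group a LiDAR cloud (N, 4: x, y, z, intensity) into vertical
    pillars on a 2D bird's-eye-view grid, PointPillars-style.
    Returns a sparse dict mapping (ix, iy) -> list of point rows."""
    xs = np.floor((points[:, 0] - x_range[0]) / pillar_size).astype(int)
    ys = np.floor((points[:, 1] - y_range[0]) / pillar_size).astype(int)
    nx = int((x_range[1] - x_range[0]) / pillar_size)  # grid width
    ny = int((y_range[1] - y_range[0]) / pillar_size)  # grid height
    pillars = {}
    for p, ix, iy in zip(points, xs, ys):
        if 0 <= ix < nx and 0 <= iy < ny:       # drop out-of-range points
            bucket = pillars.setdefault((ix, iy), [])
            if len(bucket) < max_points:        # cap points per pillar
                bucket.append(p)
    return pillars
```

In PointPillars the points of each pillar are then augmented with offsets to the pillar center and passed through a small PointNet to form the pseudo-image; this sketch stops at the grouping stage.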
Firstly, the point-wise spatial attention mechanism is integrated into the pillar feature network layer, enhancing the network's ability to extract local geometric information and to retain deep-level information, so that the resulting key features are better suited to the detection task. Point-wise spatial attention follows the basic structure of self-attention: it suppresses the influence of redundant or noisy points on the features, strengthens the description of regions sparsely covered by the point cloud, and thereby alleviates, to a certain extent, the information loss inherent in Pillar-based encoding. Secondly, the ordinary convolutions in the downsampling module, which extracts high-dimensional features from the pseudo-image of the point cloud, are replaced with CSPNet; this splits the gradient flow, further strengthens the network's learning ability while reducing computational cost, and improves detection accuracy. Finally, on the KITTI car class in highway scenarios, the proposed algorithm improves 3D detection accuracy over the baseline network by 2.23%, 2.25%, and 2.30% in the easy, moderate, and hard cases, respectively. The experimental results show that the algorithm significantly improves detection performance while maintaining real-time detection speed, which is of practical value for the optimization and improvement of autonomous driving technology and shows great potential in highway scenarios.
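The point-wise spatial attention described above follows the standard scaled dot-product self-attention pattern, applied across the points of a pillar. A minimal sketch with identity Q/K/V projections follows; a real layer would use learned linear projections, and the shapes here are illustrative rather than the paper's exact design:

```python
import numpy as np

def pointwise_attention(feats):
    """Scaled dot-product self-attention over the points of one pillar.
    feats: (P, C) per-point features. Q/K/V use identity projections
    purely for illustration; a trained layer would learn them."""
    q, k, v = feats, feats, feats                  # identity projections
    scale = np.sqrt(feats.shape[1])
    scores = q @ k.T / scale                       # (P, P) pairwise affinities
    scores -= scores.max(axis=1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over points
    return weights @ v + feats                     # attended features + residual
```

With attention of this form, each point's output is a weighted mixture of all points in the pillar, so sparsely covered regions can borrow evidence from their neighbours, which is the effect the abstract attributes to the mechanism.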
Pages: 11
Related papers
31 in total
  • [1] Multi-View 3D Object Detection Network for Autonomous Driving
    Chen, Xiaozhi
    Ma, Huimin
    Wan, Ji
    Li, Bo
    Xia, Tian
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 6526-6534
  • [2] Chenhang He, 2020, Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), P11870, DOI 10.1109/CVPR42600.2020.01189
  • [3] Point attention network for semantic segmentation of 3D point clouds
    Feng, Mingtao
    Zhang, Liang
    Lin, Xuefei
    Gilani, Syed Zulqarnain
    Mian, Ajmal
    [J]. PATTERN RECOGNITION, 2020, 107
  • [4] Dual Attention Network for Scene Segmentation
    Fu, Jun
    Liu, Jing
    Tian, Haijie
    Li, Yong
    Bao, Yongjun
    Fang, Zhiwei
    Lu, Hanqing
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 3141-3149
  • [5] Vision meets robotics: The KITTI dataset
    Geiger, A.
    Lenz, P.
    Stiller, C.
    Urtasun, R.
    [J]. INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2013, 32(11): 1231-1237
  • [6] HAN Zhuang, 2023, Journal of Qiqihar University, V39, P25
  • [7] HAN Zhuang, 2023, Journal of Qiqihar University, V39, P43
  • [8] PointPillars: Fast Encoders for Object Detection from Point Clouds
    Lang, Alex H.
    Vora, Sourabh
    Caesar, Holger
    Zhou, Lubing
    Yang, Jiong
    Beijbom, Oscar
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 12689-12697
  • [9] Li Qiao, 2022, Journal of Xi'an Jiaotong University, V56, P112
  • [10] 3D object detection in voxelized point cloud scene
    Li Rui-long
    Wu Chuan
    Zhu Ming
    [J]. CHINESE JOURNAL OF LIQUID CRYSTALS AND DISPLAYS, 2022, 37(10): 1355-1363