Multi-view semantic learning network for point cloud based 3D object detection

Cited by: 22
Authors
Yang, Yongguang [1]
Chen, Feng [1]
Wu, Fei [1, 2]
Zeng, Deliang [3]
Ji, Yi-mu [2, 4]
Jing, Xiao-Yuan [1, 5]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Coll Automat, Nanjing, Peoples R China
[2] Jiangsu HPC & Intelligent Proc Engineer Res Ctr, Nanjing, Peoples R China
[3] North China Elect Power Univ, State Key Lab Alternate Elect Power Syst Renewabl, Beijing, Peoples R China
[4] Nanjing Ctr HPC China, Nanjing, Peoples R China
[5] Wuhan Univ, Sch Comp, Wuhan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
3D object detection; LIDAR point cloud; Semantic feature; Deep learning
DOI
10.1016/j.neucom.2019.10.116
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Point cloud based 3D object detection plays a crucial role in real-world applications such as autonomous driving. In this paper, we propose the Multi-view Semantic Learning Network (MVSLN) for 3D object detection, an approach that considers feature discrimination for LIDAR point clouds. Because of the discrete and disordered nature of point clouds, most existing methods ignore low-level information and focus more on the spatial details of the point cloud. To capture the discriminative features of objects, our MVSLN takes advantage of both spatial and low-level details to further exploit semantic information. Specifically, the Multiple Views Generator (MVG) module in our approach observes the scene from four views by projecting the 3D point cloud onto planes at specific angles, which preserves many more low-level features, e.g., texture and edges. To correct the deviation introduced by the different projection angles, the Spatial Recalibration Fusion (SRF) operation adjusts the locations of the features of these four views, enabling interaction between the different projections. The recalibrated features from SRF are then sent to the developed 3D Region Proposal Network (RPN) to detect objects. Experimental results on the challenging KITTI benchmark verify that our approach achieves promising performance and outperforms state-of-the-art methods. Furthermore, the discriminative feature extractor, obtained by exploiting the conspicuous semantic information, leads to encouraging results at the hard difficulty level of both the BEV and 3D object detection tasks, without any help from camera images. (C) 2020 Elsevier B.V. All rights reserved.
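The abstract does not give the projection angles or the rasterization used by the MVG module, so the following is only a rough sketch of the general idea of viewing one point cloud from four projection planes. The function names, the rotation about the x-axis, the angles (0, 30, 60, 90 degrees), the grid size, and the random test cloud are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def project_to_plane(points, angle_deg, bins=32, extent=10.0):
    """Rotate an (N, 3) cloud about the x-axis by angle_deg (hypothetical choice),
    drop the rotated third coordinate, and count points per cell of a
    bins x bins grid covering [-extent, extent] on both remaining axes."""
    t = np.deg2rad(angle_deg)
    rot = np.array([[1.0, 0.0, 0.0],
                    [0.0, np.cos(t), -np.sin(t)],
                    [0.0, np.sin(t), np.cos(t)]])
    rotated = points @ rot.T
    grid, _, _ = np.histogram2d(
        rotated[:, 0], rotated[:, 1], bins=bins,
        range=[[-extent, extent], [-extent, extent]])
    return grid

def multi_view(points, angles=(0.0, 30.0, 60.0, 90.0)):
    """Stack one projection per angle; the four angles are assumed, not the paper's."""
    return np.stack([project_to_plane(points, a) for a in angles])

# Synthetic cloud standing in for a LIDAR scan (assumption for the demo).
rng = np.random.default_rng(0)
cloud = rng.uniform(-5.0, 5.0, size=(1000, 3))
views = multi_view(cloud)
print(views.shape)
```

With an angle of 0 the rotation is the identity, so the first view is a plain top-down (bird's-eye) count grid; the other angles tilt the plane and expose different structure of the same scene. A real pipeline would encode richer per-cell features than point counts and would need a step, such as the SRF operation described above, to align the views before fusing them.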
Pages: 477-485
Number of pages: 9