Multimodal 3D Object Detection from Simulated Pretraining

Cited by: 8
Authors
Brekke, Asmund [1]
Vatsendvik, Fredrik [1]
Lindseth, Frank [1]
Affiliations
[1] Norwegian University of Science and Technology, Trondheim, Norway
Keywords
Autonomous driving; Simulated data; 3D object detection; CARLA; KITTI; AVOD-FPN; LIDAR; Sensor fusion;
DOI
10.1007/978-3-030-35664-4_10
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Simulated data has become increasingly important in autonomous driving applications, both for validating pre-trained models and for training new ones. For these models to generalize to real-world applications, it is critical that the underlying dataset contains a variety of driving scenarios and that simulated sensor readings closely mimic real-world sensors. We present the Carla Automated Dataset Extraction Tool (CADET), a novel tool for generating training data from the CARLA simulator for use in autonomous driving research. The tool exports high-quality, synchronized LIDAR and camera data with object annotations, and can be configured to accurately reflect a real-life sensor array. Furthermore, we use this tool to generate a dataset of 10,000 samples and use it to train the 3D object detection network AVOD-FPN, with finetuning on the KITTI dataset, to evaluate the potential for effective pretraining. We also present two novel, easily modifiable LIDAR feature map configurations in Bird's Eye View for use with AVOD-FPN. These configurations are tested on the KITTI and CADET datasets to evaluate their performance as well as the usability of the simulated dataset for pretraining. Although simulated data is insufficient to fully replace real-world data, and systems pretrained on it generally do not exceed the performance of systems trained fully on real data, our results indicate that it can considerably reduce the amount of training on real data required to achieve satisfactory levels of accuracy.
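As a rough illustration of what a Bird's Eye View LIDAR feature map is, the sketch below rasterizes a point cloud into a grid with one height channel and one density channel. It is a generic encoding and not either of the two configurations proposed in the paper; the function name `bev_feature_map`, the grid ranges, and the `log(16)` density normalization are assumptions chosen for the example.

```python
import math

def bev_feature_map(points, x_range, y_range, z_range, resolution):
    """Rasterize (x, y, z) LIDAR points into two BEV grids (hypothetical helper).

    Returns (height, density): per-cell maximum normalized height and a
    log-scaled point count clipped to [0, 1].
    """
    rows = int(round((x_range[1] - x_range[0]) / resolution))
    cols = int(round((y_range[1] - y_range[0]) / resolution))
    height = [[0.0] * cols for _ in range(rows)]
    count = [[0] * cols for _ in range(rows)]

    for x, y, z in points:
        # Skip points outside the region covered by the grid.
        if not (x_range[0] <= x < x_range[1]
                and y_range[0] <= y < y_range[1]
                and z_range[0] <= z < z_range[1]):
            continue
        r = min(int((x - x_range[0]) / resolution), rows - 1)
        c = min(int((y - y_range[0]) / resolution), cols - 1)
        # Normalize height to [0, 1) within the vertical range.
        z_norm = (z - z_range[0]) / (z_range[1] - z_range[0])
        height[r][c] = max(height[r][c], z_norm)
        count[r][c] += 1

    # Assumed normalization: log-scaled count, saturating at 15 points per cell.
    density = [[min(1.0, math.log(n + 1) / math.log(16)) for n in row]
               for row in count]
    return height, density

# Two points fall in the same cell; the third lies outside the grid.
points = [(1.5, 0.5, 0.0), (1.5, 0.5, -1.0), (100.0, 0.0, 0.0)]
height, density = bev_feature_map(
    points, (0.0, 4.0), (-2.0, 2.0), (-2.0, 2.0), 1.0)
```

Because each channel is computed independently per cell, channels can be added or swapped without touching the rest of the pipeline, which is one plausible reading of a feature map configuration being easy to modify.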
Pages: 102-113
Page count: 12
Related papers
50 in total
  • [1] Multimodal 3D Histogram for Moving Object Detection
    Mukherjee, Dibyendu
    Saha, Ashirbani
    Wu, Q. M. Jonathan
    Jiang, Wei
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC), 2014, : 2397 - 2402
  • [2] Virtual Sparse Convolution for Multimodal 3D Object Detection
    Wu, Hai
    Wen, Chenglu
    Shi, Shaoshuai
    Li, Xin
    Wang, Cheng
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 21653 - 21662
  • [3] Multimodal Transformer for Automatic 3D Annotation and Object Detection
    Liu, Chang
    Qian, Xiaoyan
    Huang, Binxiao
    Qi, Xiaojuan
    Lam, Edmund
    Tan, Siew-Chong
    Wong, Ngai
    [J]. COMPUTER VISION, ECCV 2022, PT XXXVIII, 2022, 13698 : 657 - 673
  • [4] FusionPainting: Multimodal Fusion with Adaptive Attention for 3D Object Detection
    Xu, Shaoqing
    Zhou, Dingfu
    Fang, Jin
    Yin, Junbo
    Bin, Zhou
    Zhang, Liangjun
    [J]. 2021 IEEE INTELLIGENT TRANSPORTATION SYSTEMS CONFERENCE (ITSC), 2021, : 3047 - 3054
  • [5] Real-Time Multimodal 3D Object Detection with Transformers
    Liu, Hengsong
    Duan, Tongle
    [J]. WORLD ELECTRIC VEHICLE JOURNAL, 2024, 15 (07):
  • [6] MVX-Net: Multimodal VoxelNet for 3D Object Detection
    Sindagi, Vishwanath A.
    Zhou, Yin
    Tuzel, Oncel
    [J]. 2019 INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2019, : 7276 - 7282
  • [7] VirPNet: A Multimodal Virtual Point Generation Network for 3D Object Detection
    Wang, Lin
    Sun, Shiliang
    Zhao, Jing
    [J]. IEEE Transactions on Multimedia, 2024, 26 : 10597 - 10609
  • [8] 3D Object Detection and Localization using Multimodal Point Pair Features
    Drost, Bertram
    Ilic, Slobodan
    [J]. SECOND JOINT 3DIM/3DPVT CONFERENCE: 3D IMAGING, MODELING, PROCESSING, VISUALIZATION & TRANSMISSION (3DIMPVT 2012), 2012, : 9 - 16
  • [9] MMFG: Multimodal-based Mutual Feature Gating 3D Object Detection
    Xu, Wanpeng
    Fu, Zhipeng
    [J]. JOURNAL OF INTELLIGENT & ROBOTIC SYSTEMS, 2024, 110 (02)
  • [10] Homogenous multimodal 3D object detection based on deformable Transformer and attribute dependencies
    Dong, Yue
    Li, Xingfeng
    He, Hua
    [J]. PROCEEDINGS OF 2024 3RD INTERNATIONAL CONFERENCE ON CYBER SECURITY, ARTIFICIAL INTELLIGENCE AND DIGITAL ECONOMY, CSAIDE 2024, 2024, : 346 - 351