Interactive Multi-Scale Fusion of 2D and 3D Features for Multi-Object Vehicle Tracking

Cited by: 14
Authors
Wang, Guangming [1 ,2 ]
Peng, Chensheng [1 ,2 ]
Gu, Yingying [2 ]
Zhang, Jinpeng [3 ]
Wang, Hesheng [1 ,2 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai Engn Res Ctr Intelligent Control & Manage, Key Lab Marine Intelligent Equipment & Syst, Dept Automat, Key Lab Syst Control & Informat Proc, Shanghai 200240, Peoples R China
[2] Beijing Inst Control Engn, Space Optoelect Measurement & Percept Lab, Beijing 100190, Peoples R China
[3] China Aerosp Sci & Ind Corp, X Lab, Acad 2, Beijing 100854, Peoples R China
Keywords
Multi-object tracking; 3D point clouds; feature fusion; computer vision; deep learning
DOI
10.1109/TITS.2023.3275954
CLC number
TU [Building Science];
Discipline code
0813
Abstract
Multiple Object Tracking (MOT) is a significant task in autonomous driving. Nonetheless, relying on a single sensor is not robust enough, because each modality tends to fail in certain challenging situations. Texture information from RGB cameras and 3D structure information from Light Detection and Ranging (LiDAR) have respective advantages under different circumstances. Therefore, fusing features from multiple modalities contributes to the learning of discriminative features. However, effective feature fusion is nontrivial because the two modalities carry completely distinct kinds of information. Previous fusion methods usually fuse only the top-level features after separate backbones have extracted features from each modality; fusion thus happens only once, which limits the information interaction between modalities. In this paper, we propose multi-scale interactive query and fusion between pixel-wise and point-wise features to obtain more discriminative features. In addition, an attention mechanism is utilized to conduct soft feature fusion between multiple pixels and points, avoiding the inaccurate matching problem of previous single pixel-point fusion methods. We introduce PointNet++ to obtain multi-scale deep representations of point clouds and adapt it to our proposed interactive feature fusion between multi-scale features of images and point clouds. Through the interaction module, each modality can integrate complementary information from the other. Besides, we explore the effectiveness of pre-training on each single modality and fine-tuning on the fusion-based model. Our method achieves 90.32% MOTA and 72.44% HOTA on the KITTI benchmark, outperforming other approaches that do not use multi-scale soft feature fusion.
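The soft feature fusion described in the abstract can be illustrated with a minimal NumPy sketch: each 3D point queries a set of candidate pixel features via scaled dot-product attention and aggregates them with soft weights, rather than fusing with a single hard-matched pixel. The function name, weight matrices, and dimensions below are illustrative assumptions for exposition, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def soft_pixel_point_fusion(point_feats, pixel_feats, Wq, Wk, Wv):
    """Soft fusion sketch: every point attends over all candidate pixels.

    point_feats: (N, d_p) point-wise features (e.g. from PointNet++)
    pixel_feats: (M, d_x) pixel-wise features (e.g. from an image backbone)
    Wq, Wk, Wv:  hypothetical learned projections to a common dim d
    Returns (N, d_p + d): point features augmented with aggregated pixel context.
    """
    Q = point_feats @ Wq                                      # (N, d) point queries
    K = pixel_feats @ Wk                                      # (M, d) pixel keys
    V = pixel_feats @ Wv                                      # (M, d) pixel values
    attn = softmax(Q @ K.T / np.sqrt(Q.shape[-1]), axis=-1)   # (N, M) soft weights
    fused = attn @ V                                          # (N, d) weighted pixel context
    return np.concatenate([point_feats, fused], axis=-1)
```

In the paper this interaction is applied at multiple backbone scales and in both directions (points querying pixels and pixels querying points); the sketch shows a single scale and direction.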
Pages: 10618 - 10627
Page count: 10
Related Papers
50 records in total
  • [41] Multi-Scale Salient Features for Analyzing 3D Shapes
    Yang, Yong-Liang
    Shen, Chao-Hui
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2012, 27 (06) : 1092 - 1099
  • [44] 3DMOTFormer: Graph Transformer for Online 3D Multi-Object Tracking
    Ding, Shuxiao
    Rehder, Eike
    Schneider, Lukas
    Cordts, Marius
    Gall, Juergen
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 9750 - 9760
  • [45] Multi-Modal and Multi-Scale Fusion 3D Object Detection of 4D Radar and LiDAR for Autonomous Driving
    Wang, Li
    Zhang, Xinyu
    Li, Jun
    Xv, Baowei
    Fu, Rong
    Chen, Haifeng
    Yang, Lei
    Jin, Dafeng
    Zhao, Lijun
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2023, 72 (05) : 5628 - 5641
  • [46] Statistical analysis and comparative study of multi-scale 2D and 3D shape features for unbound granular geomaterials
    Zhao, Lianheng
    Zhang, Shuaihao
    Deng, Min
    Wang, Xiang
    TRANSPORTATION GEOTECHNICS, 2021, 26
  • [47] From Points to Multi-Object 3D Reconstruction
    Engelmann, Francis
    Rematas, Konstantinos
    Leibe, Bastian
    Ferrari, Vittorio
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 4586 - 4595
  • [48] A System for Generalized 3D Multi-Object Search
    Zheng, Kaiyu
    Paul, Anirudha
    Tellex, Stefanie
    2023 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA, 2023, : 1638 - 1644
  • [49] Multi-Object Tracking Framework Based on Multi-Scale Temporal Feature Aggregation
    Liu, Jialiang
    Hu, Xiaopeng
    2023 3rd International Conference on Electronic Information Engineering and Computer, EIECT 2023, 2023, : 68 - 72
  • [50] DeepFusionMOT: A 3D Multi-Object Tracking Framework Based on Camera-LiDAR Fusion With Deep Association
    Wang, Xiyang
    Fu, Chunyun
    Li, Zhankun
    Lai, Ying
    He, Jiawei
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2022, 7 (03): : 8260 - 8267