Interactive Multi-Scale Fusion of 2D and 3D Features for Multi-Object Vehicle Tracking

Cited by: 14
Authors
Wang, Guangming [1 ,2 ]
Peng, Chensheng [1 ,2 ]
Gu, Yingying [2 ]
Zhang, Jinpeng [3 ]
Wang, Hesheng [1 ,2 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai Engn Res Ctr Intelligent Control & Manage, Key Lab Marine Intelligent Equipment & Syst, Dept Automat, Key Lab Syst Control & Informat Proc, Shanghai 200240, Peoples R China
[2] Beijing Inst Control Engn, Space Optoelect Measurement & Percept Lab, Beijing 100190, Peoples R China
[3] China Aerosp Sci & Ind Corp, X Lab, Acad 2, Beijing 100854, Peoples R China
Keywords
Multi-object tracking; 3D point clouds; feature fusion; computer vision; deep learning
DOI
10.1109/TITS.2023.3275954
Chinese Library Classification (CLC) number
TU [Architectural Science]
Discipline classification code
0813
Abstract
Multiple Object Tracking (MOT) is a key task in autonomous driving. Relying on a single sensor is not robust enough, because any one modality tends to fail in some challenging situations. Texture information from RGB cameras and 3D structure information from Light Detection and Ranging (LiDAR) each have advantages under different circumstances, so fusing features from multiple modalities helps in learning discriminative features. However, effective feature fusion is nontrivial because the two modalities carry entirely different kinds of information. Previous fusion methods usually fuse only the top-level features, after the backbones have extracted features from each modality. Fusion therefore happens only once, which limits the information interaction between modalities. In this paper, we propose multi-scale interactive query and fusion between pixel-wise and point-wise features to obtain more discriminative features. In addition, an attention mechanism performs soft feature fusion between multiple pixels and points, avoiding the inaccurate matching of previous methods that fuse a single pixel with a single point. We introduce PointNet++ to obtain multi-scale deep representations of point clouds and adapt it to our proposed interactive feature fusion between multi-scale features of images and point clouds. Through the interaction module, each modality can integrate more complementary information from the other. We also explore the effectiveness of pre-training on each single modality and then fine-tuning the fusion-based model. Our method achieves 90.32% MOTA and 72.44% HOTA on the KITTI benchmark and outperforms other approaches that do not use multi-scale soft feature fusion.
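The soft pixel-point fusion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `soft_fuse`, the precomputed `neighbor_idx` array (the K candidate pixels for each point), the scaled dot-product scoring, and the final concatenation are all assumptions chosen for illustration. It only shows the general idea of weighting several pixel features per point with attention rather than matching each point to exactly one pixel.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def soft_fuse(point_feats, pixel_feats, neighbor_idx):
    """Hypothetical attention-based soft fusion of pixel features into points.

    point_feats:  (N, C) per-point features (queries).
    pixel_feats:  (M, C) per-pixel features (keys and values).
    neighbor_idx: (N, K) indices of the K candidate pixels for each point
                  (assumed to come from projecting points into the image).
    Returns the fused (N, 2C) features and the (N, K) attention weights.
    """
    keys = pixel_feats[neighbor_idx]                      # (N, K, C)
    scale = np.sqrt(point_feats.shape[1])
    scores = np.einsum("nc,nkc->nk", point_feats, keys) / scale
    weights = softmax(scores, axis=1)                     # soft match over K pixels
    attended = np.einsum("nk,nkc->nc", weights, keys)     # weighted pixel feature
    fused = np.concatenate([point_feats, attended], axis=1)
    return fused, weights

# Toy data: 4 points, 20 pixels, 8 channels, 3 candidate pixels per point.
rng = np.random.default_rng(0)
point_feats = rng.normal(size=(4, 8))
pixel_feats = rng.normal(size=(20, 8))
neighbor_idx = rng.integers(0, 20, size=(4, 3))
fused, weights = soft_fuse(point_feats, pixel_feats, neighbor_idx)
print(fused.shape, weights.sum(axis=1))
```

In a multi-scale setting, a step of this kind would be repeated at each feature scale and in both directions (points querying pixels and pixels querying points); how the paper realizes that is not specified in the abstract.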
Pages: 10618-10627
Number of pages: 10