M3SOT: Multi-Frame, Multi-Field, Multi-Space 3D Single Object Tracking

Authors
Liu, Jiaming [1]
Wu, Yue [1]
Gong, Maoguo [1]
Miao, Qiguang [1]
Ma, Wenping [1]
Xu, Cai [1]
Qin, Can [2]
Affiliations
[1] Xidian Univ, Xian, Peoples R China
[2] Northeastern Univ, Boston, MA USA
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
3D Single Object Tracking (SOT) stands as a forefront task of computer vision, proving essential for applications like autonomous driving. Sparse and occluded data in scene point clouds introduce variations in the appearance of tracked objects, adding complexity to the task. In this research, we unveil M3SOT, a novel 3D SOT framework, which synergizes multiple input frames (template sets), multiple receptive fields (continuous contexts), and multiple solution spaces (distinct tasks) in ONE model. Remarkably, M3SOT pioneers the modeling of temporality, contexts, and tasks directly from point clouds, revisiting the key factors influencing SOT. To this end, we design a transformer-based network centered on point cloud targets in the search area, aggregating diverse contextual representations and propagating target cues by employing historical frames. As M3SOT spans varied processing perspectives, we have streamlined the network, trimming its depth and optimizing its structure, to ensure lightweight and efficient deployment for SOT applications. We posit that, backed by practical construction, M3SOT sidesteps the need for complex frameworks and auxiliary components to deliver sterling results. Extensive experiments on benchmarks such as KITTI, nuScenes, and Waymo Open Dataset demonstrate that M3SOT achieves state-of-the-art performance at 38 FPS. Our code and models are available at https://github.com/ywu0912/TeamCode.git.
Pages: 3630-3638
Page count: 9