MixFormer: End-to-End Tracking with Iterative Mixed Attention

Cited by: 347
Authors
Cui, Yutao [1]
Jiang, Cheng [1]
Wang, Limin [1]
Wu, Gangshan [1]
Affiliations
[1] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1109/CVPR52688.2022.01324
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Tracking often uses a multi-stage pipeline of feature extraction, target information integration, and bounding box estimation. To simplify this pipeline and unify feature extraction with target information integration, we present a compact tracking framework, termed MixFormer, built upon transformers. Our core design exploits the flexibility of attention operations: we propose a Mixed Attention Module (MAM) that performs feature extraction and target information integration simultaneously. This synchronous modeling scheme allows the model to extract target-specific discriminative features and to perform extensive communication between the target and the search area. Based on MAM, we build the MixFormer tracking framework simply by stacking multiple MAMs with progressive patch embedding and placing a localization head on top. In addition, to handle multiple target templates during online tracking, we devise an asymmetric attention scheme in MAM to reduce computational cost, and propose an effective score prediction module to select high-quality templates. MixFormer sets a new state of the art on five tracking benchmarks: LaSOT, TrackingNet, VOT2020, GOT-10k, and UAV123. In particular, MixFormer-L achieves an NP score of 79.9% on LaSOT and 88.9% on TrackingNet, and an EAO of 0.555 on VOT2020. We also perform in-depth ablation studies to demonstrate the effectiveness of simultaneous feature extraction and information integration. Code and trained models are publicly available at https://github.com/MCG-NJU/MixFormer.
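
The mixed attention idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a single head, no learned projections (tokens serve directly as queries, keys, and values), and plain Python lists. It also assumes one plausible reading of the asymmetric scheme, namely that target (template) queries attend only to target tokens while search queries attend to the concatenated target and search tokens. The names `attend` and `mixed_attention` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def mixed_attention(target, search, asymmetric=True):
    """Attend over target and search tokens jointly (hypothetical sketch).

    Assumption: in asymmetric mode, target queries see only target tokens,
    so target features do not depend on the current search region.
    """
    mixed = target + search  # concatenate the two token sequences
    target_ctx = target if asymmetric else mixed
    new_target = attend(target, target_ctx, target_ctx)
    new_search = attend(search, mixed, mixed)
    return new_target, new_search


# Toy tokens: 2 target tokens and 3 search tokens, each of dimension 2.
target = [[1.0, 0.0], [0.0, 1.0]]
search = [[1.0, 1.0], [0.5, -0.5], [2.0, 0.0]]
new_target, new_search = mixed_attention(target, search)
```

Under this assumption, the target output does not change when the search region changes, so template features could be computed once and reused across frames, which is one way the asymmetric scheme could reduce cost.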
Pages: 13598-13608
Page count: 11
Related papers
50 in total
  • [1] MixFormer: End-to-End Tracking With Iterative Mixed Attention
    Cui, Yutao
    Jiang, Cheng
    Wu, Gangshan
    Wang, Limin
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (06) : 4129 - 4146
  • [2] End-to-End Feature Integration for Correlation Filter Tracking With Channel Attention
    Li, Dongdong
    Wen, Gongjian
    Kuai, Yangliu
    Porikli, Fatih
    IEEE SIGNAL PROCESSING LETTERS, 2018, 25 (12) : 1815 - 1819
  • [3] End-to-end Flow Correlation Tracking with Spatial-temporal Attention
    Zhu, Zheng
    Wu, Wei
    Zou, Wei
    Yan, Junjie
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 548 - 557
  • [4] Tracking Ransomware End-to-end
    Huang, Danny Yuxing
    Aliapoulios, Maxwell Matthaios
    Li, Vector Guo
    Invernizzi, Luca
    McRoberts, Kylie
    Bursztein, Elie
    Levin, Jonathan
    Levchenko, Kirill
    Snoeren, Alex C.
    McCoy, Damon
    2018 IEEE SYMPOSIUM ON SECURITY AND PRIVACY (SP), 2018, : 618 - 631
  • [5] Attention Flow: End-to-End Joint Attention Estimation
    Sumer, Omer
    Gerjets, Peter
    Trautwein, Ulrich
    Kasneci, Enkelejda
    2020 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2020, : 3316 - 3325
  • [6] Tracking Counterfeit Cryptocurrency End-to-end
    Gao, Bingyu
    Wang, Haoyu
    Xia, Pengcheng
    Wu, Siwei
    Zhou, Yajin
    Luo, Xiapu
    Tyson, Gareth
    PROCEEDINGS OF THE ACM ON MEASUREMENT AND ANALYSIS OF COMPUTING SYSTEMS, 2020, 4 (03)
  • [7] Reinforcement-Tracking: An End-to-End Trajectory Tracking Method Based on Self-Attention Mechanism
    Zhao, Guanglei
    Chen, Zihao
    Liao, Weiming
    INTERNATIONAL JOURNAL OF AUTOMOTIVE TECHNOLOGY, 2024, 25 (03) : 541 - 551
  • [8] Reinforcement-Tracking: An End-to-End Trajectory Tracking Method Based on Self-Attention Mechanism
    Guanglei Zhao
    Zihao Chen
    Weiming Liao
    International Journal of Automotive Technology, 2024, 25 : 541 - 551
  • [9] Attention-Based End-to-End Differentiable Particle Filter for Audio Speaker Tracking
    Zhao, Jinzheng
    Xu, Yong
    Qian, Xinyuan
    Liu, Haohe
    Plumbley, Mark D.
    Wang, Wenwu
    IEEE OPEN JOURNAL OF SIGNAL PROCESSING, 2024, 5 : 449 - 458
  • [10] SUPPORTIVE ATTENTION IN END-TO-END MEMORY NETWORKS
    Chien, Jen-Tzung
    Lin, Ting-An
    2018 IEEE 28TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2018,