Target-aware transformer tracking with hard occlusion instance generation

Cited: 0
Authors
Xiao, Dingkun [1 ]
Wei, Zhenzhong [1 ]
Zhang, Guangjun [1 ]
Affiliations
[1] Beihang Univ, Sch Instrumentat & Optoelect Engn, Key Lab Precis Optomechatron Technol, Minist Educ, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
visual tracking; transformer; occlusion; instance generation; target-aware; deep learning; ONLINE OBJECT TRACKING; VISUAL TRACKING;
DOI
10.3389/fnbot.2023.1323188
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Visual tracking is a crucial task in computer vision that has been applied in diverse fields. Recently, the transformer architecture has been widely adopted in visual tracking and has displaced the Siamese structure as the mainstream framework. Although transformer-based trackers have demonstrated remarkable accuracy in general circumstances, their performance in occluded scenes remains unsatisfactory, primarily because they cannot recognize incomplete target appearance information when the target is occluded. To address this issue, we propose a novel transformer tracking approach, referred to as TATT, which integrates a target-aware transformer network and a hard occlusion instance generation module. The target-aware transformer network uses an encoder-decoder structure to facilitate interaction between template and search features, extracting target information from the template feature to enhance the unoccluded parts of the target in the search features. It can directly predict the boundary between the target region and the background to generate tracking results. The hard occlusion instance generation module employs multiple image similarity measures to select the image patch in a video sequence that is most similar to the target and generates an occlusion instance that mimics real scenes, without adding an extra network. Experiments on five benchmarks, including LaSOT, TrackingNet, GOT-10k, OTB100, and UAV123, demonstrate that our tracker achieves promising performance while running at approximately 41 fps on a GPU. In particular, our tracker achieves the highest AUC scores of 65.5% and 61.2% in the partial- and full-occlusion evaluations on LaSOT, respectively.
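The occlusion-generation idea described in the abstract can be sketched as follows. This is an illustrative reconstruction, not the authors' code: all function names are assumptions, and a single cosine-similarity measure stands in for the multiple similarity measures the paper combines.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two flattened image patches."""
    a, b = a.ravel().astype(float), b.ravel().astype(float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def generate_occlusion(frame, target_box, candidate_boxes):
    """Paste the candidate patch most similar to the target over part of
    the target region, simulating a hard occlusion by a look-alike object.

    frame: H x W x 3 uint8 image
    target_box, candidate_boxes: (x, y, w, h) in pixels; in this sketch
    candidates are assumed to have the same size as the target.
    """
    x, y, w, h = target_box
    target = frame[y:y + h, x:x + w]

    def score(box):
        cx, cy, cw, ch = box
        patch = frame[cy:cy + ch, cx:cx + cw]
        if patch.shape != target.shape:  # skip mismatched sizes
            return -1.0
        return cosine_sim(patch, target)

    # Select the patch whose appearance is closest to the target.
    bx, by, bw, bh = max(candidate_boxes, key=score)
    occluder = frame[by:by + bh, bx:bx + bw].copy()

    # Cover roughly the top half of the target with the selected patch.
    out = frame.copy()
    out[y:y + h // 2, x:x + w] = occluder[:h // 2, :w]
    return out
```

Selecting the occluder from the video frame itself is what lets the module mimic realistic occlusions without training an extra generator network.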
Pages: 11