Adaptive Multi-Scale Transformer Tracker for Satellite Videos

Authors
Zhang, Xin [1]
Jiao, Licheng [1]
Li, Lingling [1]
Liu, Xu [1]
Liu, Fang [1]
Ma, Wenping [1]
Yang, Shuyuan [1]
Affiliations
[1] Xidian Univ, Int Res Ctr Intelligent Percept & Computat, Sch Artificial Intelligence, Minist Educ China, Key Lab Intelligent Percept & Image Understanding, Xian 710071, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Transformers; Satellites; Target tracking; Videos; Video tracking; Computational modeling; Adaptive Transformer; multi-scale Transformer (MT); object regression; satellite video tracking; OBJECT TRACKING;
DOI
10.1109/TGRS.2024.3441038
Chinese Library Classification number
P3 [Geophysics]; P59 [Geochemistry]
Subject classification code
0708; 070902
Abstract
Satellite video tracking tasks are often characterized by blurred foreground boundaries in vast scenes, targets that vary widely in scale, and irregular changes in appearance. These challenges make it significantly harder to optimize a robust tracker. It is therefore imperative to extract diverse features with dynamic adaptive learning capabilities for the target being tracked in each sequence. In this article, we present a novel adaptive multi-scale Transformer (MT) tracker for satellite videos that effectively explores the potential spatiotemporal information of the target. Specifically, a multi-scale spatial Transformer (MSST) is designed to leverage stage-by-stage spatial reduction and channel doubling, thereby enhancing the representation capabilities for the tracked target. For dynamic feature learning, an adaptive temporal Transformer (ATT) based on multiple cross attentions is then introduced, which analyzes the adaptive learning capacity for the dynamic target. Through learnable parameters, it automatically analyzes the weight proportion of the different attentions in each specific sequence. Finally, a multi-scale feature (MSF) regression module is crafted to improve the positioning accuracy of targets with low pixel counts in satellite scenes. This module accomplishes precise annotation of target boxes by effectively fusing features from diverse stages. We evaluate the proposed tracker on several public satellite datasets, including SatSOT, SV248S, and VISO. Experimental results show that the performance of our model is comparable to that of state-of-the-art trackers.
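The two mechanisms named in the abstract, stage-by-stage spatial reduction with channel doubling and learnable weighting of several attention outputs, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not give the reduction factor, the number of stages, or how the learnable weights are normalized, so the factor of 2, the softmax normalization, and the function names `stage_shapes` and `fuse_attention_outputs` are all assumptions made for illustration.

```python
import math


def stage_shapes(channels, height, width, num_stages, reduction=2):
    """Return the (channels, height, width) shape at each stage.

    Assumption: every stage divides the spatial size by `reduction`
    (rounding up) and doubles the channel count. The abstract only
    says "spatial reduction and channel doubling".
    """
    shapes = []
    for _ in range(num_stages):
        shapes.append((channels, height, width))
        channels *= 2
        height = math.ceil(height / reduction)
        width = math.ceil(width / reduction)
    return shapes


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_attention_outputs(outputs, logits):
    """Blend several attention outputs with normalized weights.

    `outputs` is a list of equal-length feature vectors, one per
    attention branch. `logits` stands in for the learnable parameters;
    softmax normalization is an assumption, not taken from the paper.
    """
    weights = softmax(logits)
    dim = len(outputs[0])
    return [
        sum(w * out[i] for w, out in zip(weights, outputs))
        for i in range(dim)
    ]


if __name__ == "__main__":
    print(stage_shapes(64, 32, 32, 3))
    print(fuse_attention_outputs([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0]))
```

In a trained model the logits would be updated by gradient descent, so that each sequence can favor a different attention branch; here they are fixed numbers to keep the sketch self-contained.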
Pages: 16
Related papers
50 in total
  • [31] Multi-Scale Temporal Transformer For Speech Emotion Recognition
    Li, Zhipeng
    Xing, Xiaofen
    Fang, Yuanbo
    Zhang, Weibin
    Fan, Hengsheng
    Xu, Xiangmin
    INTERSPEECH 2023, 2023, : 3652 - 3656
  • [32] Multi-scale Transformer with Decoder for Image Quality Assessment
    Zhang, Shuai
    Liu, Yutao
    ARTIFICIAL INTELLIGENCE, CICAI 2023, PT I, 2024, 14473 : 220 - 231
  • [33] MULTI-SCALE BACKGROUND SUPPRESSION ANOMALY DETECTION IN SURVEILLANCE VIDEOS
    Zhen, Yang
    Guo, Yuanfang
    Wei, Jinjie
    Bao, Xiuguo
    Huang, Di
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 1114 - 1118
  • [34] Multi-scale and multi-patch transformer for sandstorm image enhancement
    Liang, Pengwei
    Ding, Wenyu
    Fan, Lu
    Wang, Haoyu
    Li, Zihong
    Yang, Fan
    Wang, Bo
    Li, Chongyi
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2022, 89
  • [35] AMSFormer: A transformer with adaptive multi-scale partitioning and multi-level spectral filtering for time-series forecasting
    Liu, Honghao
    Diao, Yining
    Sun, Ke
    Wan, Zhaolin
    Li, Zhiyang
    NEUROCOMPUTING, 2025, 637
  • [36] Multi-scale Adaptive Threshold for DDoS Detection
    Ouerfelli, Fatima Ezzahra
    Barbaria, Khaled
    Zouari, Belhassen
    Fachkha, Claude
    RISKS AND SECURITY OF INTERNET AND SYSTEMS (CRISIS 2019), 2020, 12026 : 342 - 354
  • [37] A multi-scale adaptive Grey World algorithm
    Li, Bing
    Xu, De
    Lee, Moon Ho
    Feng, Song-He
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2007, E90D (07) : 1121 - 1124
  • [38] Adaptive Multi-Scale Detection of Acoustic Events
    Ding, Wenhao
    He, Liang
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 294 - 306
  • [39] Multi-scale adaptive networks for efficient inference
    Li, Linfeng
    Su, Weixing
    Liu, Fang
    He, Maowei
    Liang, Xiaodan
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024, 15 (02) : 267 - 282
  • [40] Multi-scale Adaptive Computational Ghost Imaging
    Shuai Sun
    Wei-Tao Liu
    Hui-Zu Lin
    Er-Feng Zhang
    Ji-Ying Liu
    Quan Li
    Ping-Xing Chen
    Scientific Reports, 6