Multi-scale feature extraction and fusion with attention interaction for RGB-T

Authors
Xing, Haijiao [1]
Wei, Wei [1]
Zhang, Lei [1]
Zhang, Yanning [1]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Xian 710072, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Single-object tracking; RGB-T tracking; Feature fusion; Siamese networks; Tracking
DOI
10.1016/j.patcog.2024.110917
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
RGB-T single-object tracking aims to track objects using both RGB images and thermal infrared (TIR) images. Although Siamese-based RGB-T trackers have an advantage in tracking speed, their accuracy still falls short of other state-of-the-art trackers (e.g., MDNet). In this study, we revisit existing Siamese-based RGB-T trackers and find that this gap stems from insufficient feature fusion between the RGB and TIR images, as well as incomplete interaction between the template frame and the search frame. Motivated by this, we propose a multi-scale feature extraction and fusion network with Temporal-Spatial Memory (MFATrack). Instead of fusing the RGB and TIR images using a single-scale feature map or only the high-level features of a multi-scale feature map, MFATrack adopts a new fusion strategy that fuses features from all scales, which can capture contextual information in shallow layers and details in the deep layer. To learn features better suited to tracking, MFATrack fuses features across several consecutive frames. In addition, we propose a self-attention interaction module designed specifically for the search frame; it highlights the features in the search frame that are relevant to the target and thus facilitates rapid convergence for target localization. Experimental results demonstrate that MFATrack is not only fast but also achieves better tracking accuracy than competing methods, including MDNet-based methods and other Siamese-based trackers.
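The two ideas named in the abstract, fusing RGB and TIR features at every scale and applying self-attention to the search frame, can be illustrated with a minimal sketch. This is not the MFATrack implementation: the abstract does not specify the fusion operator or the attention layout, so the weighted sum per scale, the identity query/key/value projections, and the names `fuse_all_scales` and `self_attention` are all assumptions chosen for illustration.

```python
import math


def fuse_all_scales(rgb_feats, tir_feats, weight=0.5):
    """Fuse RGB and TIR features at every scale.

    Assumption: fusion is a per-scale weighted sum; the paper's actual
    operator is not described in the abstract.
    """
    if len(rgb_feats) != len(tir_feats):
        raise ValueError("both modalities need the same number of scales")
    fused = []
    for rgb, tir in zip(rgb_feats, tir_feats):
        if len(rgb) != len(tir):
            raise ValueError("feature sizes must match at each scale")
        fused.append([weight * r + (1.0 - weight) * t for r, t in zip(rgb, tir)])
    return fused


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention over search-frame tokens.

    Assumption: queries, keys and values are the tokens themselves
    (identity projections), to keep the sketch short.
    """
    dim = len(tokens[0])
    attended = []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * value[d] for w, value in zip(weights, tokens)) for d in range(dim)]
        )
    return attended


# Hypothetical toy features: three scales, flattened to short vectors.
rgb_feats = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0], [6.0]]
tir_feats = [[3.0, 2.0, 1.0, 0.0], [0.0, 2.0], [2.0]]
fused = fuse_all_scales(rgb_feats, tir_feats)

# Hypothetical search-frame tokens, each a 2-dimensional feature.
search_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(search_tokens)
```

Every scale contributes a fused feature here, in contrast to a scheme that keeps only the deepest scale. Each attended token is a convex combination of the input tokens, so tokens similar to many others in the search frame are emphasized.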
Pages: 12