Deeper Siamese network with multi-level feature fusion for real-time visual tracking

Cited by: 11
Authors
Yang, Kang [1 ,2 ,3 ]
Song, Huihui [1 ,2 ,3 ]
Zhang, Kaihua [1 ,2 ,3 ]
Fan, Jiaqing [1 ,2 ,3 ]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, B DAT, Nanjing, Jiangsu, Peoples R China
[2] Nanjing Univ Informat Sci & Technol, B DAT, Nanjing, Jiangsu, Peoples R China
[3] Nanjing Univ Informat Sci & Technol, CICAEET, Nanjing, Jiangsu, Peoples R China
Keywords
feature extraction; image fusion; object tracking; image representation; neural nets; real-time systems; image resolution; deeper Siamese network; multilevel feature fusion; real-time visual tracking; SiamN-based trackers; target representation; tracking performance degeneration; deeper ResNet; feature map resolution; feature agglomeration module; visual object tracking;
DOI
10.1049/el.2019.1041
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract
In recent years, Siamese networks (SiamN) have achieved great success in visual tracking in terms of both accuracy and efficiency. Nevertheless, most SiamN-based trackers employ a shallow network such as AlexNet and use its top-layer features as the target representation. These features are less discriminative, which usually degrades tracking performance under large deformation and in the presence of similar distractors. A straightforward way to address this issue is to replace the backbone network of SiamN with a deeper ResNet. However, this does not boost performance much, because the high-level feature maps have low resolution and lose useful spatial details. To address this issue, the authors propose a lightweight yet effective feature agglomeration module (FAM) that adaptively fuses low-level and high-level features for robust tracking. Specifically, they first develop a generalised non-local attention module to enhance the discriminative capability of high-level semantic features. Then, they design an inception-like module to enhance the representative power of low-level features, which carry more spatial details. Both types of features are then adaptively fused in the FAM so that they complement each other. Extensive evaluations on OTB-2015 and the VOT2017 challenge demonstrate that the proposed tracker consistently achieves favourable performance against several state-of-the-art trackers and runs at 50 fps.
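The adaptive fusion of low-level and high-level features described in the abstract can be illustrated with a minimal sketch. This is not the authors' FAM: the non-local attention and inception-like modules are omitted, and the function names (`upsample_nearest`, `softmax`, `fuse_features`), the nearest-neighbour upsampling, and the two-element `gate_scores` input are all assumptions made for illustration. In a trained tracker the fusion weights would be learned; here they are passed in by hand.

```python
import math


def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a 2-D map given as a list of rows."""
    return [
        [value for value in row for _ in range(factor)]
        for row in fmap
        for _ in range(factor)
    ]


def softmax(scores):
    """Turn raw gate scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_features(low, high, gate_scores):
    """Fuse a fine low-level map with a coarse high-level map.

    The coarse map is upsampled to the fine resolution, then the two maps
    are blended with softmax-normalised weights (hypothetical stand-ins
    for weights a network would learn).
    """
    factor = len(low) // len(high)
    high_up = upsample_nearest(high, factor)
    w_low, w_high = softmax(gate_scores)
    return [
        [w_low * l + w_high * h for l, h in zip(low_row, high_row)]
        for low_row, high_row in zip(low, high_up)
    ]


# Toy inputs: a 4x4 low-level map (spatial detail) and a 2x2 high-level map.
low = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
high = [[10.0, 20.0], [30.0, 40.0]]

# Equal gate scores give equal weights of 0.5 to each feature level.
fused = fuse_features(low, high, gate_scores=[0.0, 0.0])
```

The fused map keeps the 4x4 resolution of the low-level input, which reflects the motivation stated in the abstract: high-level semantics are combined with low-level spatial detail instead of replacing it.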
Pages: 742 - 744
Number of pages: 3