Deeper Siamese network with multi-level feature fusion for real-time visual tracking

Cited by: 11
Authors
Yang, Kang [1, 2, 3]
Song, Huihui [1, 2, 3]
Zhang, Kaihua [1, 2, 3]
Fan, Jiaqing [1, 2, 3]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, B DAT, Nanjing, Jiangsu, Peoples R China
[2] Nanjing Univ Informat Sci & Technol, B DAT, Nanjing, Jiangsu, Peoples R China
[3] Nanjing Univ Informat Sci & Technol, CICAEET, Nanjing, Jiangsu, Peoples R China
Keywords
feature extraction; image fusion; object tracking; image representation; neural nets; real-time systems; image resolution; deeper Siamese network; multilevel feature fusion; real-time visual tracking; SiamN-based trackers; target representation; tracking performance degeneration; deeper ResNet; feature map resolution; feature agglomeration module; visual object tracking;
DOI
10.1049/el.2019.1041
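Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract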
In recent years, Siamese networks (SiamN) have achieved great success in visual tracking in terms of both accuracy and efficiency. Nevertheless, most SiamN-based trackers employ a shallow network such as AlexNet and use its top-layer features as the target representation. These features are less discriminative, which usually degrades tracking performance under large deformation and similar distractors. A straightforward way to address this issue is to replace the backbone network of SiamN with a deeper ResNet. However, this does not boost performance much, because the high-level feature maps have low resolution and lose useful spatial details. To address this issue, the authors propose a lightweight yet effective feature agglomeration module (FAM) that adaptively fuses low-level and high-level features for robust tracking. Specifically, they first develop a generalised non-local attention module to enhance the discriminative capability of high-level semantic features. Then, they design an inception-like module to enhance the representative power of low-level features, which carry more spatial details. Both types of features are then adaptively fused in the FAM so that they complement each other. Extensive evaluations on OTB-2015 and the VOT2017 challenge demonstrate that the proposed tracker consistently achieves favourable performance against several state-of-the-art trackers and runs at 50 fps.
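The abstract does not say how the FAM combines the two feature streams. As a rough illustration only, the sketch below assumes one common form of adaptive fusion: the low-resolution high-level map is upsampled to the size of the low-level map, and the two are mixed with softmax-normalised weights. The function names, the nearest-neighbour upsampling, and the two fusion scores are all hypothetical and are not taken from the paper.

```python
import math


def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a 2-D feature map given as a list of rows."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(low, high, scores):
    """Weighted sum of a low-level map and an upsampled high-level map.

    `scores` holds two hypothetical fusion logits; in a trained network
    they would be learned rather than passed in by hand (an assumption).
    """
    factor = len(low) // len(high)
    high_up = upsample_nearest(high, factor)
    w_low, w_high = softmax(scores)
    return [
        [w_low * a + w_high * b for a, b in zip(row_low, row_high)]
        for row_low, row_high in zip(low, high_up)
    ]


# Toy inputs: a 4x4 low-level map and a 2x2 high-level map.
low = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
high = [[10.0, 20.0], [30.0, 40.0]]

# Equal scores give equal weights of 0.5 for each stream.
fused = fuse(low, high, [0.0, 0.0])
```

Raising the second score shifts the weight toward the high-level (semantic) stream, while raising the first favours the low-level stream and its finer spatial detail. The fused map keeps the resolution of the low-level input.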
Pages: 742 - 744
Page count: 3
Related papers
50 in total
  • [21] SiamDA: Dual attention Siamese network for real-time visual tracking
    Pu, Lei
    Feng, Xinxi
    Hou, Zhiqiang
    Yu, Wangsheng
    Zha, Yufei
    [J]. SIGNAL PROCESSING-IMAGE COMMUNICATION, 2021, 95
  • [22] An IoU-aware Siamese network for real-time visual tracking
    Wei, Bingbing
    Chen, Hongyu
    Cao, Siqi
    Ding, Qinghai
    Luo, Haibo
    [J]. NEUROCOMPUTING, 2023, 527 : 13 - 26
  • [23] Siamese Centerness Prediction Network for Real-Time Visual Object Tracking
    Wu, Yue
    Cai, Chengtao
    Yeo, Chai Kiat
    [J]. NEURAL PROCESSING LETTERS, 2023, 55 (02) : 1029 - 1044
  • [25] MFENet: Multi-level feature enhancement network for real-time semantic segmentation
    Zhang, Boxiang
    Li, Wenhui
    Hui, Yuming
    Liu, Jiayun
    Guan, Yuanyuan
    [J]. NEUROCOMPUTING, 2020, 393 : 54 - 65
  • [26] MLVSNet: Multi-level Voting Siamese Network for 3D Visual Tracking
    Wang, Zhoutao
    Xie, Qian
    Lai, Yu-Kun
    Wu, Jing
    Long, Kun
    Wang, Jun
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 3081 - 3090
  • [27] Siamese Deformable Cross-Correlation Network for Real-Time Visual Tracking
    Zheng, Linyu
    Chen, Yingying
    Tang, Ming
    Wang, Jinqiao
    Lu, Hanqing
    [J]. NEUROCOMPUTING, 2020, 401 : 36 - 47
  • [28] Siamese target estimation network with AIoU loss for real-time visual tracking
    Li, Zhiyong
    Hu, Chenming
    Nai, Ke
    Yuan, Jin
    [J]. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2021, 77
  • [29] Real-Time Tracking and Positioning of Underwater Visual Targets Based on Siamese Network
    Zhang, Wenbo
    Liu, Weidong
    Li, Le
    Jiao, Huifeng
    Li, Yanli
    Li, Linfeng
    [J]. PROCEEDINGS OF 2022 INTERNATIONAL CONFERENCE ON AUTONOMOUS UNMANNED SYSTEMS, ICAUS 2022, 2023, 1010 : 1356 - 1367
  • [30] MLFNet: Multi-Level Fusion Network for Real-Time Semantic Segmentation of Autonomous Driving
    Fan, Jiaqi
    Wang, Fei
    Chu, Hongqing
    Hu, Xiao
    Cheng, Yifan
    Gao, Bingzhao
    [J]. IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2023, 8 (01): : 756 - 767