Residual attention mechanism for visual tracking

Cited by: 0
Authors
Cheng L. [1 ]
Wang Y. [1 ]
Tian C. [1 ]
Affiliation
[1] School of Electronic Engineering, Xidian University, Xi'an
Keywords
Attention mechanism; Convolutional neural network; Object tracking; Residual network
DOI
10.19665/j.issn1001-2400.2020.06.021
Abstract
In recent years, driven by the growth of training data and advances in hardware, a large number of tracking algorithms based on deep learning have been proposed. Compared with traditional tracking algorithms, deep-learning-based trackers show great development potential. However, the conventional convolutional neural network structure cannot fully exploit its powerful feature learning and representation abilities in the tracking task. This paper proposes an improved feature extraction network for video target tracking. Building on a conventional feature extraction network, it introduces an attention mechanism and a feature fusion strategy in the form of a residual network. In addition, a loss function based on the region overlap rate is introduced in the training stage of the network model, which yields more accurate localization. Experimental results show that the improved algorithm can track the target accurately over long sequences. Moreover, the method generalizes well and can serve as a reference for other deep-learning-based tracking algorithms. © 2020, The Editorial Board of Journal of Xidian University. All rights reserved.
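The abstract names two technical components: an attention mechanism fused with the backbone features in residual form, and a training loss based on the region overlap rate. The paper's exact modules are not specified in the abstract, so the following PyTorch sketch is only an illustration of both ideas, assuming a channel-attention block applied residually and a plain IoU loss over (x1, y1, x2, y2) boxes; every name and shape below is an assumption, not the authors' implementation.

```python
# Illustrative sketch only: module names, shapes, and hyperparameters are
# assumptions; the paper's actual architecture and loss are not given here.
import torch
import torch.nn as nn


class ResidualAttentionBlock(nn.Module):
    """Channel attention applied in residual form: y = x + attn(x) * x."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                               # global context per channel
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),                                          # per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual fusion: the identity path keeps the original features,
        # while the attention path reweights them channel by channel.
        return x + self.attn(x) * x


def iou_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Overlap-rate loss for boxes given as (x1, y1, x2, y2), shape (N, 4)."""
    # Intersection rectangle.
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    # Union = sum of areas minus intersection.
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    union = area_p + area_t - inter
    iou = inter / union.clamp(min=1e-6)
    return (1.0 - iou).mean()
```

In a tracker of this kind, such a block would typically be inserted after the backbone's convolution stages, and the overlap-rate term added to the training loss; GIoU is a common variant of this loss when predicted and target boxes may not overlap at all.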
Pages: 148-157, 163