Survey of video behavior recognition

Cited by: 0
Authors
Luo H. [1 ]
Wang C. [1 ]
Lu F. [1 ]
Affiliations
[1] School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou
Source
Journal on Communications (Editorial Board of Journal on Communications), 2018, Vol. 39
Funding
National Natural Science Foundation of China
Keywords
Behavior recognition; Data set; Deep network; Handcrafted
DOI
10.11959/j.issn.1000-436x.2018107
Abstract
Behavior recognition is developing rapidly, and a number of behavior recognition algorithms that learn features automatically with deep networks have been proposed. Deep learning methods, however, require large amounts of training data and place high demands on computer storage and computing power. After a brief review of currently popular deep-network-based behavior recognition methods, the survey focuses on traditional behavior recognition methods, which usually follow a pipeline of video feature extraction, feature modeling and classification. Following this basic pipeline, the recognition process is reviewed in terms of feature sampling, feature descriptors, feature processing, descriptor aggregation and vector coding. The benchmark data sets commonly used to evaluate algorithm performance are also summarized. © 2018, Editorial Board of Journal on Communications. All rights reserved.
Pages: 169-180
Number of pages: 11
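
The abstract above outlines the traditional pipeline of feature sampling, feature description, descriptor aggregation/vector coding and classification. The sketch below is a minimal, illustrative instance of that pipeline, not the specific method of any surveyed work: extract_local_descriptors is a hypothetical placeholder for the video feature step (e.g., HOG/HOF/MBH along sampled points or trajectories), and k-means bag-of-visual-words encoding with a linear SVM from scikit-learn is used as one common choice for the aggregation and classification stages.

```python
# Minimal sketch of the traditional recognition pipeline described in the abstract:
# local feature extraction -> codebook learning -> vector coding -> linear SVM.
# Names such as extract_local_descriptors are illustrative placeholders.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

def extract_local_descriptors(video):
    """Placeholder: return an (n, d) array of local descriptors for one video
    (e.g., HOG/HOF/MBH computed at sampled points or along trajectories)."""
    raise NotImplementedError

def encode_bovw(descriptors, kmeans):
    """Hard-assign each descriptor to its nearest codeword and return an
    L2-normalized histogram, i.e., one fixed-length vector per video."""
    words = kmeans.predict(descriptors)
    hist = np.bincount(words, minlength=kmeans.n_clusters).astype(np.float64)
    return hist / (np.linalg.norm(hist) + 1e-12)

def train_pipeline(train_videos, train_labels, vocab_size=256):
    # 1. Feature sampling + description for every training video
    all_desc = [extract_local_descriptors(v) for v in train_videos]
    # 2. Codebook learning (k-means) on the pooled descriptors
    kmeans = KMeans(n_clusters=vocab_size, n_init=4).fit(np.vstack(all_desc))
    # 3. Descriptor aggregation / vector coding (bag of visual words)
    X = np.stack([encode_bovw(d, kmeans) for d in all_desc])
    # 4. Classification with a linear SVM
    clf = LinearSVC().fit(X, np.asarray(train_labels))
    return kmeans, clf
```

A Fisher vector or VLAD encoding would replace encode_bovw (with a GMM or k-means codebook respectively) without changing the rest of the skeleton.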