Double-Stream Convolutional Networks with Sequential Optical Flow Image for Action Recognition

Cited by: 8
Authors
Li Qinghui [1]
Li Aihua [1]
Wang Tao [1]
Cui Zhigao [1]
Affiliations
[1] Rocket Force Engn Univ, Acad Operat Support, Xian 710025, Shaanxi, Peoples R China
Keywords
machine vision; action recognition; sequential optical flow image; convolutional neural network; support vector machine
DOI
10.3788/AOS201838.0615002
Chinese Library Classification number
O43 [Optics]
Discipline classification code
070207; 0803
Abstract
To exploit the long-term temporal information in video and improve action recognition accuracy, a new recognition approach is proposed based on the sequential optical flow image and double-stream convolutional neural networks. First, the Rank support vector machine (SVM) algorithm is used to compress consecutive optical flow frames into a single sequential optical flow image, which models the long-term temporal structure of the video. Second, a double-stream convolutional network is designed that consists of an appearance and short-term motion stream and a long-term motion stream. The network takes stacked RGB frames and sequential optical flow images as input, extracting the appearance and short-term motion information from the former and the long-term motion information from the latter. Finally, a linear SVM fuses the C3D descriptor and the VGG descriptor for action recognition. Experimental results on the HMDB51 and UCF101 datasets show that the proposed approach effectively improves action recognition accuracy by combining spatial information with temporal motion information.
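The compression step in the abstract can be illustrated with a small sketch. It does not train a Rank SVM as the paper does. Instead it assumes the simpler closed-form approximation to rank pooling, in which frame t of T receives the fixed weight 2t - T - 1, so later frames count positively and earlier frames negatively. The function names and the toy 2x2 frames are hypothetical and serve only as illustration.

```python
def rank_pool_coefficients(num_frames):
    """Weights alpha_t = 2t - T - 1 for t = 1..T (later frames weigh more).

    Assumption: this is the closed-form approximation to rank pooling,
    not the Rank SVM optimisation described in the abstract.
    """
    return [2 * t - num_frames - 1 for t in range(1, num_frames + 1)]


def sequential_flow_image(flow_frames):
    """Collapse T equally sized 2-D flow frames into one 2-D weighted sum."""
    alphas = rank_pool_coefficients(len(flow_frames))
    height, width = len(flow_frames[0]), len(flow_frames[0][0])
    pooled = [[0.0] * width for _ in range(height)]
    for alpha, frame in zip(alphas, flow_frames):
        for y in range(height):
            for x in range(width):
                pooled[y][x] += alpha * frame[y][x]
    return pooled


# Toy input: three 2x2 frames standing in for one optical flow channel.
# A single "moving" pixel visits a different location in each frame.
frames = [
    [[0.0, 1.0], [0.0, 0.0]],
    [[0.0, 0.0], [1.0, 0.0]],
    [[0.0, 0.0], [0.0, 1.0]],
]
image = sequential_flow_image(frames)
print(image)
```

In the pooled image the sign and magnitude of each pixel encode when motion occurred there, which is how a single image can carry the temporal order of a whole clip.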
Pages: 7