Hidden Two-Stream Convolutional Networks for Action Recognition

Cited by: 84
Authors
Zhu, Yi [1]
Lan, Zhenzhong [2]
Newsam, Shawn [1]
Hauptmann, Alexander [2]
Affiliations
[1] Univ Calif Merced, Merced, CA 95343 USA
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Source
Keywords
Action recognition; Optical flow; Unsupervised learning
DOI
10.1007/978-3-030-20893-6_23
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Analyzing videos of human actions involves understanding the temporal relationships among video frames. State-of-the-art action recognition approaches rely on traditional optical flow estimation methods to pre-compute motion information for CNNs. Such a two-stage approach is computationally expensive, storage demanding, and not end-to-end trainable. In this paper, we present a novel CNN architecture that implicitly captures motion information between adjacent frames. We name our approach hidden two-stream CNNs because it takes only raw video frames as input and directly predicts action classes without explicitly computing optical flow. Our end-to-end approach is 10x faster than its two-stage baseline. Experimental results on four challenging action recognition datasets (UCF101, HMDB51, THUMOS14, and ActivityNet v1.2) show that our approach significantly outperforms the previous best real-time approaches.
Pages: 363-378
Page count: 16