Hidden Two-Stream Convolutional Networks for Action Recognition

Cited by: 84
Authors
Zhu, Yi [1]
Lan, Zhenzhong [2]
Newsam, Shawn [1]
Hauptmann, Alexander [2]
Affiliations
[1] University of California, Merced, Merced, CA 95343 USA
[2] Carnegie Mellon University, Pittsburgh, PA 15213 USA
Source
Keywords
Action recognition; Optical flow; Unsupervised learning
DOI
10.1007/978-3-030-20893-6_23
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Analyzing videos of human actions involves understanding the temporal relationships among video frames. State-of-the-art action recognition approaches rely on traditional optical flow estimation methods to pre-compute motion information for CNNs. Such a two-stage approach is computationally expensive, storage demanding, and not end-to-end trainable. In this paper, we present a novel CNN architecture that implicitly captures motion information between adjacent frames. We name our approach hidden two-stream CNNs because it only takes raw video frames as input and directly predicts action classes without explicitly computing optical flow. Our end-to-end approach is 10x faster than its two-stage baseline. Experimental results on four challenging action recognition datasets: UCF101, HMDB51, THUMOS14 and ActivityNet v1.2 show that our approach significantly outperforms the previous best real-time approaches.
Pages: 363-378
Page count: 16