Mutually Reinforced Spatio-Temporal Convolutional Tube for Human Action Recognition

Authors
Wu, Haoze [1]
Liu, Jiawei [1]
Zha, Zheng-Jun [1]
Chen, Zhenzhong [2]
Sun, Xiaoyan [3]
Affiliations
[1] Univ Sci & Technol China, Natl Engn Lab Brain Inspired Intelligence Technol, Beijing, Peoples R China
[2] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan, Peoples R China
[3] Microsoft Res Asia, Intelligent Multimedia Grp, Beijing, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Recent works use 3D convolutional neural networks to explore spatio-temporal information for human action recognition. However, they either ignore the correlation between spatial and temporal features or incur high computational cost when extracting spatio-temporal features. In this work, we propose a novel and efficient Mutually Reinforced Spatio-Temporal Convolutional Tube (MRST) for human action recognition. It decomposes 3D inputs into spatial and temporal representations, mutually enhances both by exploiting the interaction between spatial and temporal information, and selectively emphasizes informative spatial appearance and temporal motion, while reducing structural complexity. Moreover, we design three types of MRSTs that differ in the order in which spatial and temporal information is enhanced; each contains a spatio-temporal decomposition unit, a mutually reinforced unit, and a spatio-temporal fusion unit. We also propose an end-to-end deep network, MRST-Net, built on the MRSTs to better explore spatio-temporal information in human actions. Extensive experiments show that MRST-Net achieves the best performance compared with state-of-the-art approaches.
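The abstract gives no layer details, so the following is only a rough illustration of two ideas it mentions: why decomposing a 3D convolution into spatial and temporal parts can reduce complexity, and what a mutual enhancement between the two branches might look like. The kernel sizes, the sigmoid cross-gating, and the sum fusion are all assumptions chosen for illustration; the function names are hypothetical and none of this is taken from the paper.

```python
import math


def conv3d_params(c_in, c_out, kt=3, kh=3, kw=3):
    """Weight count of a full kt x kh x kw 3D convolution (bias ignored)."""
    return c_in * c_out * kt * kh * kw


def factorized_params(c_in, c_out, kt=3, kh=3, kw=3):
    """Weight count of a 1 x kh x kw spatial convolution followed by a
    kt x 1 x 1 temporal convolution (assumed factorization, bias ignored)."""
    spatial = c_in * c_out * kh * kw
    temporal = c_out * c_out * kt
    return spatial + temporal


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def cross_gate_and_fuse(spatial, temporal):
    """Toy stand-in for mutual reinforcement (an assumption, not the paper's
    unit): each branch is rescaled by a gate computed from the mean
    activation of the other branch, then the branches are summed."""
    gate_from_temporal = sigmoid(sum(temporal) / len(temporal))
    gate_from_spatial = sigmoid(sum(spatial) / len(spatial))
    reinforced_spatial = [s * gate_from_temporal for s in spatial]
    reinforced_temporal = [t * gate_from_spatial for t in temporal]
    # Fusion by element-wise sum (also an assumption).
    return [s + t for s, t in zip(reinforced_spatial, reinforced_temporal)]


if __name__ == "__main__":
    full = conv3d_params(64, 64)
    split = factorized_params(64, 64)
    print(full, split, round(split / full, 3))
    print(cross_gate_and_fuse([1.0, -1.0], [2.0, -2.0]))
```

With 64 input and 64 output channels and 3 x 3 x 3 kernels, the factorized form in this sketch needs fewer than half the weights of the full 3D kernel, which is the general kind of saving the abstract alludes to.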
Pages: 968-974
Page count: 7