Unsupervised video-based action recognition using two-stream generative adversarial network

Cited: 0
Authors
Wei Lin
Huanqiang Zeng
Jianqing Zhu
Chih-Hsien Hsia
Junhui Hou
Kai-Kuang Ma
Institutions
[1] Huaqiao University,School of Engineering and School of Information Science and Engineering
[2] Huaqiao University,School of Engineering
[3] Ilan University,Department of Computer Science and Information Engineering
[4] The City University of Hong Kong,Department of Computer Science
[5] Nanyang Technological University,School of Electrical and Electronic Engineering
Keywords
Action recognition; Two-stream generative adversarial network; Unsupervised learning
DOI
Not available
Abstract
Video-based action recognition faces many challenges, such as complex and varied dynamic motion, spatio-temporally similar action factors, and the manual labeling of archived videos over large datasets. Extracting discriminative spatio-temporal action features from videos while resisting the effect of similar factors in an unsupervised manner is therefore pivotal. To this end, this paper proposes an unsupervised video-based action recognition method, called the two-stream generative adversarial network (TS-GAN), which comprehensively learns the static texture and dynamic motion information inherent in videos while taking both detailed and global information into account. Specifically, the spatio-temporal information in videos is extracted by a two-stream GAN. Considering that proper attention to detail can alleviate the influence of spatio-temporally similar factors on the network, a global-detailed layer is proposed to resist such factors by fusing intermediate features (i.e., detailed action information) with high-level semantic features (i.e., global action information). It is worth mentioning that, compared with recent unsupervised video-based action recognition methods, the proposed TS-GAN requires neither complex pretext tasks nor the construction of positive and negative sample pairs. Extensive experiments on the UCF101 and HMDB51 datasets demonstrate that the proposed TS-GAN is superior to multiple classical and state-of-the-art unsupervised action recognition methods.
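The fusion of intermediate (detailed) and high-level (global) features described in the abstract can be sketched as a small PyTorch module. This is a minimal illustration only, not the authors' implementation: the class name `GlobalDetailedFusion`, the pooling choice, and all tensor shapes are assumptions made for the sketch.

```python
# Hypothetical sketch of a "global-detailed" fusion step: an intermediate
# feature map (detailed action cues) is pooled to a vector and fused with a
# high-level semantic vector (global action cues). All names/shapes here are
# illustrative assumptions, not the paper's actual architecture.
import torch
import torch.nn as nn

class GlobalDetailedFusion(nn.Module):
    def __init__(self, detail_channels: int, global_dim: int, out_dim: int):
        super().__init__()
        # Collapse the spatial dimensions of the intermediate feature map.
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Project the concatenated detailed + global features to a joint space.
        self.fuse = nn.Linear(detail_channels + global_dim, out_dim)

    def forward(self, detail_map: torch.Tensor, global_feat: torch.Tensor) -> torch.Tensor:
        # detail_map: (N, C, H, W) intermediate features; global_feat: (N, D).
        detail_vec = self.pool(detail_map).flatten(1)        # (N, C)
        fused = torch.cat([detail_vec, global_feat], dim=1)  # (N, C + D)
        return torch.relu(self.fuse(fused))                  # (N, out_dim)

fusion = GlobalDetailedFusion(detail_channels=256, global_dim=512, out_dim=128)
out = fusion(torch.randn(2, 256, 7, 7), torch.randn(2, 512))
print(out.shape)  # torch.Size([2, 128])
```

Concatenation followed by a linear projection is only one plausible fusion choice; the paper's global-detailed layer may combine the two feature streams differently.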
Pages: 5077 - 5091
Page count: 14
Related papers
50 results
  • [1] Unsupervised video-based action recognition using two-stream generative adversarial network
    Lin, Wei
    Zeng, Huanqiang
    Zhu, Jianqing
    Hsia, Chih-Hsien
    Hou, Junhui
    Ma, Kai-Kuang
    NEURAL COMPUTING & APPLICATIONS, 2024, 36 (09): 5077 - 5091
  • [2] Spatio-Temporal Learning for Video Deblurring based on Two-Stream Generative Adversarial Network
    Song, Liyao
    Wang, Quan
    Li, Haiwei
    Fan, Jiancun
    Hu, Bingliang
    NEURAL PROCESSING LETTERS, 2021, 53 (04): 2701 - 2714
  • [3] Convolutional Two-Stream Network Fusion for Video Action Recognition
    Feichtenhofer, Christoph
    Pinz, Axel
    Zisserman, Andrew
    2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 1933 - 1941
  • [4] Two-Stream Convolutional Neural Network for Video Action Recognition
    Qiao, Han
    Liu, Shuang
    Xu, Qingzhen
    Liu, Shouqiang
    Yang, Wanggan
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2021, 15 (10): 3668 - 3684
  • [5] Two-Stream Convolution Neural Network with Video-stream for Action Recognition
    Dai, Wei
    Chen, Yimin
    Huang, Chen
    Gao, Ming-Ke
    Zhang, Xinyu
    2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019
  • [6] TBRNet: Two-Stream BiLSTM Residual Network for Video Action Recognition
    Wu, Xiao
    Ji, Qingge
    ALGORITHMS, 2020, 13 (07): 1 - 21
  • [7] Spatiotemporal two-stream LSTM network for unsupervised video summarization
    Hu, Min
    Hu, Ruimin
    Wang, Zhongyuan
    Xiong, Zixiang
    Zhong, Rui
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (28): 40489 - 40510
  • [8] Two-Stream Multirate Recurrent Neural Network for Video-Based Pedestrian Reidentification
    Zeng, Zhiqiang
    Li, Zhihui
    Cheng, De
    Zhang, Huaxiang
    Zhan, Kun
    Yang, Yi
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2018, 14 (07): 3179 - 3186