Unsupervised video-based action recognition using two-stream generative adversarial network

Cited by: 0
Authors
Wei Lin
Huanqiang Zeng
Jianqing Zhu
Chih-Hsien Hsia
Junhui Hou
Kai-Kuang Ma
Affiliations
[1] Huaqiao University,School of Engineering and School of Information Science and Engineering
[2] Huaqiao University,School of Engineering
[3] Ilan University,Department of Computer Science and Information Engineering
[4] The City University of Hong Kong,Department of Computer Science
[5] Nanyang Technological University,School of Electrical and Electronic Engineering
Keywords
Action recognition; Two-stream generative adversarial network; Unsupervised learning
DOI: not available
Abstract
Video-based action recognition faces many challenges, such as complex and varied dynamic motion, spatio-temporally similar action factors, and the manual labeling of archived videos over large datasets. Extracting discriminative spatio-temporal action features from videos while resisting the effect of similar factors in an unsupervised manner is therefore pivotal. To this end, this paper proposes an unsupervised video-based action recognition method, called the two-stream generative adversarial network (TS-GAN), which comprehensively learns the static texture and dynamic motion information inherent in videos while taking both detailed and global information into account. Specifically, the extraction of spatio-temporal information from videos is achieved by a two-stream GAN. Considering that proper attention to detail can alleviate the influence of spatio-temporally similar factors on the network, a global-detailed layer is proposed to resist similar factors by fusing intermediate features (i.e., detailed action information) and high-level semantic features (i.e., global action information). It is worth mentioning that, compared with recent unsupervised video-based action recognition methods, the proposed TS-GAN requires neither complex pretext tasks nor the construction of positive and negative sample pairs. Extensive experiments conducted on the UCF101 and HMDB51 datasets demonstrate that the proposed TS-GAN is superior to multiple classical and state-of-the-art unsupervised action recognition methods.
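The fusion idea behind the global-detailed layer (combining intermediate "detailed" features with high-level "global" features from both streams) can be sketched in a few lines of NumPy. The toy encoders, feature shapes, and pooling used here are placeholder assumptions for illustration, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def stream_features(x):
    """Stand-in for one stream's encoder: returns a flattened intermediate
    (detailed) feature map and a small high-level (global) semantic vector.
    Both reductions are toy choices, not the paper's layers."""
    inter = x.mean(axis=0)            # (H, W) "detailed" map from mid-network
    high = x.mean(axis=(0, 1, 2))     # scalar summary standing in for semantics
    return inter.ravel(), np.full(8, high)

video = rng.normal(size=(16, 32, 32))  # frames x H x W: spatial (texture) stream input
flow  = rng.normal(size=(15, 32, 32))  # frame differences: temporal (motion) stream input

spat_detail, spat_global = stream_features(video)
temp_detail, temp_global = stream_features(flow)

# Global-detailed fusion: concatenate detailed and global features of both streams
# into one action descriptor for downstream classification.
descriptor = np.concatenate([spat_detail, spat_global, temp_detail, temp_global])
print(descriptor.shape)  # (2064,) = 2 * (32*32 + 8)
```

The concatenation is the simplest possible fusion; the point is only that the descriptor retains both fine-grained intermediate evidence and pooled global semantics from each stream.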
Pages: 5077-5091 (14 pages)
Related papers (50 records)
  • [31] Two-Stream Interactive Memory Network for Video Facial Expression Recognition
    Chen, Lingyu
    Ouyang, Yong
    Xu, Ranyi
    Sun, Sisi
    Zeng, Yawen
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2022, PT III, 2022, 13531 : 299 - 311
  • [33] Two-Stream Attention Network for Pain Recognition from Video Sequences
    Thiam, Patrick
    Kestler, Hans A.
    Schwenker, Friedhelm
    SENSORS, 2020, 20 (03)
  • [34] Human action recognition using two-stream attention based LSTM networks
    Dai, Cheng
    Liu, Xingang
    Lai, Jinfeng
    APPLIED SOFT COMPUTING, 2020, 86
  • [35] Spatial-temporal interaction learning based two-stream network for action recognition
    Liu, Tianyu
    Ma, Yujun
    Yang, Wenhan
    Ji, Wanting
    Wang, Ruili
    Jiang, Ping
    INFORMATION SCIENCES, 2022, 606 : 864 - 876
  • [36] An Accurate Device-Free Action Recognition System Using Two-Stream Network
    Sheng, Biyun
    Fang, Yuanrun
    Xiao, Fu
    Sun, Lijuan
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2020, 69 (07) : 7930 - 7939
  • [37] Interactive two-stream graph neural network for skeleton-based action recognition
    Yang, Dun
    Zhou, Qing
    Wen, Ju
    JOURNAL OF ELECTRONIC IMAGING, 2021, 30 (03)
  • [38] Transferable two-stream convolutional neural network for human action recognition
    Xiong, Qianqian
    Zhang, Jianjing
    Wang, Peng
    Liu, Dongdong
    Gao, Robert X.
    JOURNAL OF MANUFACTURING SYSTEMS, 2020, 56 : 605 - 614
  • [39] Two-Stream Action Recognition-Oriented Video Super-Resolution
    Zhang, Haochen
    Liu, Dong
    Xiong, Zhiwei
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 8798 - 8807
  • [40] A two-stream conditional generative adversarial network for improving semantic predictions in urban driving scenes
    Lateef, F.
    Kas, M.
    Chahi, A.
    Ruichek, Y.
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 133