Multi-View Latent Variable Discriminative Models For Action Recognition

Cited by: 0
Authors
Song, Yale [1]
Morency, Louis-Philippe [2]
Davis, Randall [1]
Affiliations
[1] MIT Comp Sci & Artificial Intelligence Lab, Cambridge, MA USA
[2] USC Inst Creat Technol, Los Angeles, CA 90094 USA
Funding
National Science Foundation (USA)
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Many human action recognition tasks involve data that can be factorized into multiple views such as body postures and hand shapes. These views often interact with each other over time, providing important cues to understanding the action. We present multi-view latent variable discriminative models that jointly learn both view-shared and view-specific sub-structures to capture the interaction between views. Knowledge about the underlying structure of the data is formulated as a multi-chain structured latent conditional model, explicitly learning the interaction between multiple views using disjoint sets of hidden variables in a discriminative manner. The chains are tied using a predetermined topology that repeats over time. We present three topologies - linked, coupled, and linked-coupled - that differ in the type of interaction between views that they model. We evaluate our approach on both segmented and unsegmented human action recognition tasks, using the ArmGesture, the NATOPS, and the ArmGesture-Continuous data. Experimental results show that our approach outperforms previous state-of-the-art action recognition models.
Pages: 2120-2127
Page count: 8