Multimodal Multipart Learning for Action Recognition in Depth Videos

Cited by: 76
Authors
Shahroudy, Amir [1 ,2 ]
Ng, Tian-Tsong [2 ]
Yang, Qingxiong [3 ]
Wang, Gang [1 ]
Affiliations
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
[2] Inst Infocomm Res, 1 Fusionopolis Way, Singapore 138632, Singapore
[3] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Hong Kong, Peoples R China
Funding
National Research Foundation, Singapore
Keywords
Action recognition; Kinect; joint sparse regression; mixed norms; structured sparsity; group feature selection; MULTITASK; FEATURES; SELECTION; TRACKING; SPARSITY; MODEL
DOI
10.1109/TPAMI.2015.2505295
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
The articulated and complex nature of human actions makes action recognition difficult. One approach to handling this complexity is to divide it into the kinetics of body parts and to analyze actions based on these partial descriptors. We propose a joint sparse regression based learning method that utilizes structured sparsity to model each action as a combination of multimodal features from a sparse set of body parts. To represent the dynamics and appearance of parts, we employ a heterogeneous set of depth and skeleton based features. The structure of the multimodal multipart features is formulated into the learning framework via the proposed hierarchical mixed norm, which regularizes the structured features of each part and applies sparsity between them, in favor of group feature selection. Our experimental results demonstrate the effectiveness of the proposed learning method: it outperforms other methods on all three tested datasets and saturates one of them by achieving perfect accuracy.
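As an informal illustration of the group feature selection idea in the abstract: a hierarchical mixed norm of the l1-over-l2 kind penalizes the sum of Euclidean norms of part-level feature groups, so entire body parts (with their modality-specific feature blocks) can be switched off together. The minimal Python/NumPy sketch below shows a generic group soft-thresholding (proximal) step for such an l2,1-type penalty; the function name prox_group_l21, the 3-parts-by-2-modalities group layout, and the regularization value are illustrative assumptions, not the paper's actual hierarchical mixed norm or training procedure.

import numpy as np

def prox_group_l21(w, groups, lam):
    """Group soft-thresholding: proximal operator of lam * sum_g ||w_g||_2."""
    w = w.copy()
    for idx in groups:
        norm = np.linalg.norm(w[idx])
        if norm <= lam:
            w[idx] = 0.0                 # whole part/modality group dropped
        else:
            w[idx] *= 1.0 - lam / norm   # shrink the surviving group
    return w

# Assumed toy layout: 3 body parts x 2 modalities (depth, skeleton), 4 dims each.
rng = np.random.default_rng(0)
dims = 4
groups = [np.arange(g * dims, (g + 1) * dims) for g in range(6)]
w = rng.normal(size=6 * dims)
w_sparse = prox_group_l21(w, groups, lam=1.5)
print([round(float(np.linalg.norm(w_sparse[idx])), 3) for idx in groups])

Groups whose norm falls below the threshold are zeroed out entirely; in a group-lasso-style formulation this is what "sparsity between parts", i.e., selecting a sparse set of body parts, amounts to.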
Pages: 2123-2129
Page count: 7
Related Papers
50 records
  • [1] Learning Action Recognition Model From Depth and Skeleton Videos
    Rahmani, Hossein
    Bennamoun, Mohammed
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 5833 - 5842
  • [2] Temporal cues enhanced multimodal learning for action recognition in RGB-D videos
    Liu, Dan
    Meng, Fanrong
    Xia, Qing
    Ma, Zhiyuan
    Mi, Jinpeng
    Gan, Yan
    Ye, Mao
    Zhang, Jianwei
    NEUROCOMPUTING, 2024, 594
  • [3] Structured Learning for Action Recognition in Videos
    Long, Yinghan
    Srinivasan, Gopalakrishnan
    Panda, Priyadarshini
    Roy, Kaushik
    IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, 2019, 9 (03) : 475 - 484
  • [4] DAAL: Deep activation-based attribute learning for action recognition in depth videos
    Zhang, Chenyang
    Tian, Yingli
    Guo, Xiaojie
    Liu, Jingen
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2018, 167 : 37 - 49
  • [5] Learning correlations for human action recognition in videos
    Yi, Yun
    Wang, Hanli
    Zhang, Bowen
    MULTIMEDIA TOOLS AND APPLICATIONS, 2017, 76 (18) : 18891 - 18913
  • [6] Learning correlations for human action recognition in videos
    Yun Yi
    Hanli Wang
    Bowen Zhang
    Multimedia Tools and Applications, 2017, 76 : 18891 - 18913
  • [7] Action recognition in depth videos using hierarchical Gaussian descriptor
    Xuan Son Nguyen
    Abdel-Illah Mouaddib
    Thanh Phuong Nguyen
    Laurent Jeanpierre
    Multimedia Tools and Applications, 2018, 77 : 21617 - 21652
  • [8] Action recognition in depth videos using hierarchical Gaussian descriptor
    Nguyen, Xuan Son
    Mouaddib, Abdel-Illah
    Thanh Phuong Nguyen
    Jeanpierre, Laurent
    MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (16) : 21617 - 21652
  • [9] Class-Incremental Learning for Action Recognition in Videos
    Park, Jaeyoo
    Kang, Minsoo
    Han, Bohyung
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 13678 - 13687
  • [10] Transfer Learning for Videos: From Action Recognition to Sign Language Recognition
    Sarhan, Noha
    Frintrop, Simone
    2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020, : 1811 - 1815