Structured Images for RGB-D Action Recognition

Cited by: 49
Authors
Wang, Pichao [1]
Wang, Shuang [2]
Gao, Zhimin [1]
Hou, Yonghong [2]
Li, Wanqing [1]
Affiliations
[1] Univ Wollongong, Adv Multimedia Res Lab, Wollongong, NSW, Australia
[2] Tianjin Univ, Sch Elect Informat Engn, Tianjin, Peoples R China
DOI
10.1109/ICCVW.2017.123
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents an effective yet simple video representation for RGB-D based action recognition. It represents a depth map sequence as three pairs of structured dynamic images, at the body, part and joint levels respectively, through bidirectional rank pooling. Unlike previous works that applied one Convolutional Neural Network (ConvNet) to each part or joint separately, one pair of structured dynamic images is constructed from the depth maps at each granularity level and serves as the input to a ConvNet. The structured dynamic image not only preserves the spatio-temporal information but also enhances the structure information across both body parts/joints and different temporal scales. In addition, it requires little computation and memory to construct. This new representation, referred to as Spatially Structured Dynamic Depth Images (S²DDI), aggregates motion and structure information in a depth sequence from the global to the fine-grained level, and makes it possible to fine-tune existing ConvNet models trained on image data for the classification of depth sequences, without training the models from scratch. The proposed representation is evaluated on five benchmark datasets, namely MSRAction3D, G3D, MSRDailyActivity3D, SYSU 3D HOI and UTD-MHAD, and achieves state-of-the-art results on all five.
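The bidirectional pooling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how rank pooling is solved, so the sketch swaps in the closed-form approximate rank pooling used in the dynamic image literature, where the pooled image is a weighted sum of frames. The function names `approx_rank_pool` and `bidirectional_dynamic_images` are hypothetical, and the body, part and joint level structuring described in the abstract is not reproduced here.

```python
import numpy as np


def approx_rank_pool(frames):
    """Collapse a (T, H, W) sequence into one (H, W) image.

    Assumption: approximate rank pooling over time-averaged frames,
    used here as a stand-in for the rank pooling named in the abstract.
    """
    frames = np.asarray(frames, dtype=np.float64)
    T = frames.shape[0]
    t = np.arange(1, T + 1)
    # Weight of the running mean V_t in the pooled image is (2t - T - 1).
    # V_t averages frames 1..t, so frame tau receives the sum over t >= tau
    # of (2t - T - 1) / t, computed as a reversed cumulative sum.
    beta = (2 * t - T - 1) / t
    alpha = np.cumsum(beta[::-1])[::-1]
    return np.tensordot(alpha, frames, axes=(0, 0))


def bidirectional_dynamic_images(frames):
    """Return the (forward, backward) pair of pooled images."""
    frames = np.asarray(frames, dtype=np.float64)
    forward = approx_rank_pool(frames)
    backward = approx_rank_pool(frames[::-1])  # same sequence, reversed in time
    return forward, backward


if __name__ == "__main__":
    # Synthetic stand-in for a depth map sequence: 8 frames of 4x4 pixels.
    rng = np.random.default_rng(0)
    depth_sequence = rng.random((8, 4, 4))
    fwd, bwd = bidirectional_dynamic_images(depth_sequence)
    print(fwd.shape, bwd.shape)
```

Each pooled image has the same spatial size as a single depth frame, which is what allows it to be rescaled and fed to a ConvNet pre-trained on ordinary images, as the abstract describes.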
Pages: 1005-1014
Page count: 10