Structured Images for RGB-D Action Recognition

Cited by: 49
Authors
Wang, Pichao [1]
Wang, Shuang [2]
Gao, Zhimin [1]
Hou, Yonghong [2]
Li, Wanqing [1]
Affiliations
[1] Univ Wollongong, Adv Multimedia Res Lab, Wollongong, NSW, Australia
[2] Tianjin Univ, Sch Elect Informat Engn, Tianjin, Peoples R China
DOI
10.1109/ICCVW.2017.123
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents an effective yet simple video representation for RGB-D based action recognition. It proposes to represent a depth map sequence as three pairs of structured dynamic images, at the body, part and joint levels respectively, through bidirectional rank pooling. Unlike previous works that applied one Convolutional Neural Network (ConvNet) to each part/joint separately, one pair of structured dynamic images is constructed from depth maps at each granularity level and serves as the input to a ConvNet. The structured dynamic image not only preserves the spatial-temporal information but also enhances the structure information across both body parts/joints and different temporal scales. In addition, it requires little computation and memory to construct. This new representation, referred to as Spatially Structured Dynamic Depth Images (S²DDI), aggregates motion and structure information in a depth sequence from global to fine-grained levels, and enables fine-tuning of existing ConvNet models trained on image data for classification of depth sequences, without the need to train the models afresh. The proposed representation is evaluated on five benchmark datasets, namely MSRAction3D, G3D, MSRDailyActivity3D, SYSU 3D HOI and UTD-MHAD, and achieves state-of-the-art results on all five.
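The abstract does not say how the bidirectional rank pooling is computed, so the following is only a rough sketch of the idea and not the authors' implementation. It swaps in the closed-form approximate rank pooling weights (obtained by ranking time-averaged frames) in place of solving a ranking problem, and it ignores the body/part/joint structuring entirely. The function names and the flattened-list frame format are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Assumption: each depth map is flattened to a 1-D sequence of pixel values.
Frame = Sequence[float]


def rank_pool_weights(num_frames: int) -> List[float]:
    """Approximate rank pooling weights: alpha_i = sum_{t>=i} (2t - T - 1) / t."""
    T = num_frames
    return [
        sum((2 * t - T - 1) / t for t in range(i, T + 1))
        for i in range(1, T + 1)
    ]


def approx_rank_pool(frames: Sequence[Frame]) -> List[float]:
    """Collapse a frame sequence into one 'dynamic image' (weighted frame sum)."""
    if not frames:
        raise ValueError("need at least one frame")
    weights = rank_pool_weights(len(frames))
    pooled = [0.0] * len(frames[0])
    for alpha, frame in zip(weights, frames):
        for i, value in enumerate(frame):
            pooled[i] += alpha * value
    return pooled


def bidirectional_dynamic_images(
    frames: Sequence[Frame],
) -> Tuple[List[float], List[float]]:
    """Pool the sequence in forward and in reversed temporal order."""
    forward = approx_rank_pool(frames)
    backward = approx_rank_pool(list(reversed(frames)))
    return forward, backward


if __name__ == "__main__":
    # Toy sequence of three 2-pixel "depth maps".
    toy = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
    fwd, bwd = bidirectional_dynamic_images(toy)
    print(fwd, bwd)
```

In this sketch later frames receive larger weights, so the forward and backward images emphasise opposite ends of the sequence; presumably this is why the representation uses them as a pair.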
Pages: 1005-1014
Page count: 10