Structured Images for RGB-D Action Recognition

Cited by: 49

Authors:

Wang, Pichao [1]

Wang, Shuang [2]

Gao, Zhimin [1]

Hou, Yonghong [2]

Li, Wanqing [1]

Affiliations:

[1] Univ Wollongong, Adv Multimedia Res Lab, Wollongong, NSW, Australia

[2] Tianjin Univ, Sch Elect Informat Engn, Tianjin, Peoples R China

Source:

2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2017) | 2017

DOI:

10.1109/ICCVW.2017.123

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper presents an effective yet simple video representation for RGB-D based action recognition. It represents a depth map sequence as three pairs of structured dynamic images, at the body, part and joint levels respectively, through bidirectional rank pooling. Unlike previous works that applied one Convolutional Neural Network (ConvNet) to each part/joint separately, one pair of structured dynamic images is constructed from depth maps at each granularity level and serves as the input to a ConvNet. The structured dynamic image not only preserves the spatial-temporal information but also enhances the structure information across both body parts/joints and different temporal scales. In addition, it requires low computational cost and memory to construct. This new representation, referred to as Spatially Structured Dynamic Depth Images (S²DDI), aggregates motion and structure information in a depth sequence from global to fine-grained levels, and makes it possible to fine-tune existing ConvNet models trained on image data for classification of depth sequences, without training the models afresh. The proposed representation is evaluated on five benchmark datasets, namely MSRAction3D, G3D, MSRDailyActivity3D, SYSU 3D HOI and UTD-MHAD, and achieves state-of-the-art results on all five.
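
The abstract mentions bidirectional rank pooling but gives no implementation details, so the following is only a minimal sketch under stated assumptions. It does not solve the rank pooling optimisation; instead it uses the closed-form approximate rank pooling weights known from the dynamic image literature, applied once in forward and once in reverse frame order. The function names (`rank_pooling_weights`, `dynamic_image`, `bidirectional_dynamic_images`) are hypothetical, and the body/part/joint structuring described in the abstract is not reproduced here.

```python
import numpy as np


def rank_pooling_weights(num_frames):
    """Closed-form weights for approximate rank pooling (an assumption,
    not necessarily the exact pooling used in the paper).

    Frame i (1-based) gets weight sum_{t=i..T} (2t - T - 1) / t.
    """
    t = np.arange(1, num_frames + 1, dtype=np.float64)
    terms = (2.0 * t - num_frames - 1.0) / t
    # Reverse cumulative sum: entry i holds the sum of terms[i:].
    return np.cumsum(terms[::-1])[::-1]


def dynamic_image(frames):
    """Collapse a (T, H, W) depth sequence into one (H, W) image."""
    frames = np.asarray(frames, dtype=np.float64)
    weights = rank_pooling_weights(frames.shape[0])
    # Weighted sum over the time axis.
    return np.tensordot(weights, frames, axes=(0, 0))


def bidirectional_dynamic_images(frames):
    """Return the (forward, backward) pair of dynamic images."""
    frames = np.asarray(frames, dtype=np.float64)
    return dynamic_image(frames), dynamic_image(frames[::-1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    depth_sequence = rng.random((8, 4, 4))  # synthetic stand-in for depth maps
    forward, backward = bidirectional_dynamic_images(depth_sequence)
    print(forward.shape, backward.shape)
```

Because the weights sum to zero, a perfectly static sequence collapses to an all-zero image, so only temporal change survives the pooling. In practice each image would typically be rescaled to an 8-bit range before being fed to a ConvNet pretrained on ordinary images.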

Pages: 1005-1014

Page count: 10
