A 4D strong spatio-temporal feature learning network for behavior recognition of point cloud sequences

Times cited: 0
Authors
You, Kaijun [1]
Hou, Zhenjie [1]
Liang, Jiuzhen [1]
Lin, En [2]
Shi, Haiyong [1]
Zhong, Zhuokun [1]
Affiliations
[1] Changzhou Univ, Engn Res Ctr Key Equipment Petrochem Proc, Changzhou 213100, Jiangsu, Peoples R China
[2] Goldcard Smart Grp Co Ltd, Hangzhou 310000, Zhejiang, Peoples R China
Keywords
Point cloud sequence; Coordinate transformation; Spatio-temporal information; Feature enhancement
DOI
10.1007/s11042-023-18045-3
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Depth map sequences are widely used in behavior recognition because they provide depth information. However, depth pixels are only weakly correlated with each other, so much of the spatio-temporal structure of the behavior data is lost. Point cloud data provide rich spatial information and geometric features, which compensate for this shortcoming of depth images. To make further use of the geometric information of actions and to exploit spatio-temporal structure more fully, this paper proposes a 4D strong spatio-temporal feature learning network for behavior recognition of point cloud sequences. A coordinate transformation is applied to a depth dataset to generate a point cloud dataset; the network then processes each point cloud frame and learns 4D strong spatio-temporal features (three spatial dimensions and one temporal dimension). The network consists of two modules: a spatial-level feature learning module and a temporal-level position encoding module. The spatial-level feature learning module processes and learns the spatial dimensions of the point cloud. Each point cloud frame passes through two progressive structure-enhanced set abstraction layers, which output a feature sequence representing the strong spatial structure. A max pooling operation then turns this into a complete spatial-level feature sequence. The temporal-level position encoding module processes and learns the time dimension of the point cloud, injecting time-series information into the feature sequence through position encoding and related operations. Finally, the multi-level features of human actions are aggregated and classified. Experiments were carried out on three public datasets, and the authors report that the proposed network outperformed the state-of-the-art methods it was compared against.
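The pipeline described in the abstract (depth map to point cloud, per-frame spatial features, max pooling, temporal position encoding) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not give the coordinate transformation or the form of the position encoding, so the sketch assumes a standard pinhole camera back-projection with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, and a Transformer-style sinusoidal encoding. The hand-made per-point features are only a stand-in for the learned set abstraction layers, and all function names are hypothetical.

```python
import math


def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map (list of rows) to 3D points.

    Assumes a pinhole camera model; pixels with depth <= 0 are treated
    as missing measurements and skipped.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


def max_pool(features):
    """Channel-wise maximum over a list of equal-length feature vectors."""
    return [max(channel) for channel in zip(*features)]


def sinusoidal_position(t, dim):
    """Sinusoidal encoding of frame index t (assumed form; dim must be even)."""
    enc = []
    for i in range(0, dim, 2):
        freq = 1.0 / (10000 ** (i / dim))
        enc.append(math.sin(t * freq))
        enc.append(math.cos(t * freq))
    return enc


def encode_sequence(depth_frames, fx, fy, cx, cy):
    """Toy pipeline: one pooled feature vector per frame plus a time encoding.

    Each point gets the stand-in feature (x, y, z, distance to origin);
    a real network would learn these features instead. Every frame is
    assumed to contain at least one valid depth pixel.
    """
    sequence = []
    for t, depth in enumerate(depth_frames):
        pts = depth_to_point_cloud(depth, fx, fy, cx, cy)
        feats = [(x, y, z, math.sqrt(x * x + y * y + z * z)) for x, y, z in pts]
        frame_feat = max_pool(feats)
        pos = sinusoidal_position(t, len(frame_feat))
        sequence.append([f + p for f, p in zip(frame_feat, pos)])
    return sequence
```

Adding the encoding to the pooled frame vector is one common way to inject frame order into features that are otherwise order-agnostic; a classifier would then aggregate the resulting sequence.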
Pages: 63193-63211
Number of pages: 19