A 4D strong spatio-temporal feature learning network for behavior recognition of point cloud sequences

Authors
You, Kaijun [1]
Hou, Zhenjie [1]
Liang, Jiuzhen [1]
Lin, En [2]
Shi, Haiyong [1]
Zhong, Zhuokun [1]
Affiliations
[1] Changzhou Univ, Engn Res Ctr Key Equipment Petrochem Proc, Changzhou 213100, Jiangsu, Peoples R China
[2] Goldcard Smart Grp Co Ltd, Hangzhou 310000, Zhejiang, Peoples R China
Keywords
Point cloud sequence; Coordinate transformation; Spatio-temporal information; Feature enhancement;
DOI
10.1007/s11042-023-18045-3
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Depth map sequences are widely used in behavior recognition because they provide depth information. However, depth pixels are only weakly correlated with one another, so much of the spatio-temporal structure of the behavior data is lost. Point cloud data provides rich spatial information and geometric features, which compensates for these shortcomings of depth images. To make fuller use of the geometric information of actions and of their spatio-temporal structure, this paper proposes a 4D strong spatio-temporal feature learning network for behavior recognition of point cloud sequences. A coordinate transformation is applied to a depth dataset to generate a point cloud dataset; the network then processes each frame of point cloud data and learns 4D strong spatio-temporal features (three spatial dimensions and one temporal dimension). The network consists of two modules: a spatial-level feature learning module and a temporal-level position encoding module. The spatial-level feature learning module processes and learns the spatial dimensions of the point cloud. Each frame of point cloud data passes through two progressive structure-enhanced set abstraction layers and outputs a feature sequence that represents the strong spatial structure; a max pooling operation then turns it into a complete spatial-level feature sequence. The temporal-level position encoding module processes and learns the time dimension of the point cloud, injecting time-series information into the feature sequence through position encoding and related operations. Finally, the multi-level features of human actions are aggregated and classified. Experiments were conducted on three public datasets, and the results showed that the proposed network outperformed current state-of-the-art methods.
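The pipeline outlined in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the pinhole camera model and its intrinsics (fx, fy, cx, cy), the hand-written per-point features that stand in for the learned set abstraction layers, and the Transformer-style sinusoidal form of the position encoding are all hypothetical choices, since the abstract does not specify them.

```python
import math

FEATURE_DIM = 4  # hypothetical feature width used throughout this sketch


def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map (list of rows) into 3D points.

    Assumed pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    Pixels with non-positive depth are treated as missing and skipped.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z > 0:
                points.append(((u - cx) * z / fx, (v - cy) * z / fy, float(z)))
    return points


def max_pool(features):
    """Channel-wise maximum over a set of per-point feature vectors."""
    return [max(channel) for channel in zip(*features)]


def sinusoidal_encoding(t, dim):
    """Sinusoidal encoding of frame index t (assumed form, not from the paper)."""
    enc = []
    for i in range(dim):
        angle = t / (10000 ** (2 * (i // 2) / dim))
        enc.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return enc


def encode_sequence(depth_frames, fx, fy, cx, cy):
    """Return one temporally tagged feature vector per non-empty frame."""
    sequence = []
    for t, depth in enumerate(depth_frames):
        points = depth_to_points(depth, fx, fy, cx, cy)
        if not points:
            continue  # a frame with no valid depth contributes nothing
        # Stand-in for the learned spatial module: xyz plus distance to origin.
        feats = [(x, y, z, math.sqrt(x * x + y * y + z * z)) for x, y, z in points]
        pooled = max_pool(feats)                    # one vector per frame
        pe = sinusoidal_encoding(t, FEATURE_DIM)    # inject the frame order
        sequence.append([f + p for f, p in zip(pooled, pe)])
    return sequence


frames = [[[0, 2], [4, 0]], [[1, 0], [0, 3]]]
features = encode_sequence(frames, fx=1.0, fy=1.0, cx=0.0, cy=0.0)
```

In a real network the per-point features would come from trained layers and the pooled sequence would feed a classifier; here the pooling and the additive encoding only show where spatial and temporal information enter.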
Pages: 63193-63211
Number of pages: 19
Related papers
50 in total
  • [41] 3D modeling using hierarchical feature point and spatio-temporal relationship
    Lee, HK
    Kwon, SK
    Kim, HS
    Ha, YH
    VISUAL COMMUNICATIONS AND IMAGE PROCESSING 2001, 2001, 4310 : 787 - 797
  • [42] Attention guided spatio-temporal network for 3D signature recognition
    Singh, Aradhana Kumari
    Koundal, Deepika
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (11) : 33985 - 33997
  • [44] Single and interactive human behavior recognition algorithm based on spatio-temporal interest point
    Wang, Shi-Gang
    Sun, Ai-Meng
    Zhao, Wen-Ting
    Hui, Xiang-Long
    Jilin Daxue Xuebao (Gongxueban)/Journal of Jilin University (Engineering and Technology Edition), 2015, 45 (01) : 304 - 308
  • [45] SVQNet: Sparse Voxel-Adjacent Query Network for 4D Spatio-Temporal LiDAR Semantic Segmentation
    Chen, Xuechao
    Xu, Shuangjie
    Zou, Xiaoyi
    Cao, Tongyi
    Yeung, Dit-Yan
    Fang, Lu
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 8535 - 8544
  • [46] A Guided Attention 4D Convolutional Neural Network for Modeling Spatio-Temporal Patterns of Functional Brain Networks
    Yan, Jiadong
    Zhao, Yu
    Jiang, Mingxin
    Zhang, Shu
    Zhang, Tuo
    Yang, Shimin
    Chen, Yuzhong
    Zhao, Zhongbo
    He, Zhibin
    Becker, Benjamin
    Liu, Tianming
    Kendrick, Keith
    Jiang, Xi
    PATTERN RECOGNITION AND COMPUTER VISION, PT III, 2021, 13021 : 350 - 361
  • [47] Multi-parameter visualisation of 3D/4D spatio-temporal data
    Brown, IM
    GEOGRAPHICAL INFORMATION '97: FROM RESEARCH TO APPLICATION THROUGH COOPERATION, VOLS 1 AND 2, 1997, : 566 - 574
  • [48] Anchor-Based Spatio-Temporal Attention 3-D Convolutional Networks for Dynamic 3-D Point Cloud Sequences
    Wang, Guangming
    Liu, Hanwen
    Chen, Muyao
    Yang, Yehui
    Liu, Zhe
    Wang, Hesheng
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2021, 70
  • [49] Masked Spatio-Temporal Structure Prediction for Self-supervised Learning on Point Cloud Videos
    Shen, Zhiqiang
    Sheng, Xiaoxiao
    Fan, Hehe
    Wang, Longguang
    Guo, Yulan
    Liu, Qiong
    Wen, Hao
    Zhou, Xi
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 16534 - 16543
  • [50] Phase Space Reconstruction Driven Spatio-Temporal Feature Learning for Dynamic Facial Expression Recognition
    Wang, Shanmin
    Shuai, Hui
    Liu, Qingshan
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2022, 13 (03) : 1466 - 1476