Attention-Driven Body Pose Encoding for Human Activity Recognition

Cited by: 4
Authors
Debnath, Bappaditya [1 ]
O'Brien, Mary [2 ]
Kumar, Swagat [1 ]
Behera, Ardhendu [1 ]
Affiliations
[1] Edge Hill Univ, Dept Comp Sci, Ormskirk L39 4QP, England
[2] Edge Hill Univ, Fac Hlth Social Care & Med, Ormskirk L39 4QP, England
Keywords
ENSEMBLE;
DOI
10.1109/ICPR48806.2021.9412487
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
This article proposes a novel attention-based body pose encoding for human activity recognition. Most existing human activity recognition approaches based on 3D pose data enrich the input with additional handcrafted representations such as velocity, super-normal vectors, and pairwise relations. The enriched data complements the 3D body-joint positions and improves model performance. In this paper, we propose an approach that learns enhanced feature representations directly from a given sequence of 3D body joints. To achieve this encoding, the approach exploits two body pose streams: 1) a spatial stream that encodes the relationships between body joints at each time step, capturing the spatial structure of the pose; and 2) a temporal stream that learns the variation of each individual joint over the entire sequence, providing a temporally enhanced representation. These two pose streams are then fused using a multi-head attention mechanism. We also capture contextual information from the RGB video stream using a deep Convolutional Neural Network (CNN) combined with multi-head attention and a bidirectional Long Short-Term Memory (LSTM) network. Finally, the RGB video stream is combined with the fused body pose stream, yielding an end-to-end deep model for effective human activity recognition. The proposed model is evaluated on three datasets, including the challenging NTU-RGBD dataset, and achieves state-of-the-art results.
Pages: 5897-5904
Page count: 8
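
For readers who want a concrete starting point, the sketch below illustrates the overall architecture described in the abstract: two pose streams (spatial and temporal) fused with multi-head attention, an RGB stream processed with multi-head attention and a bidirectional LSTM, and a final combination of the two for classification. It is a minimal PyTorch sketch; the layer sizes, joint and frame counts, mean-pooling, and the exact fusion and classification heads are illustrative assumptions, not the authors' published configuration.

# Minimal sketch of the architecture described in the abstract.
# All dimensions and design details below are illustrative assumptions.
import torch
import torch.nn as nn

class TwoStreamPoseFusion(nn.Module):
    def __init__(self, num_joints=25, coords=3, seq_len=64, d_model=128, num_classes=60):
        super().__init__()
        # Spatial stream: per-frame encoding of the joint configuration.
        self.spatial = nn.Linear(num_joints * coords, d_model)
        # Temporal stream: per-joint encoding of its trajectory over the sequence.
        self.temporal = nn.Linear(seq_len * coords, d_model)
        # Multi-head attention fuses the two pose streams.
        self.fuse_attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        # RGB stream: per-frame CNN features are attended, then summarised by a BiLSTM.
        self.rgb_attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.rgb_lstm = nn.LSTM(d_model, d_model // 2, bidirectional=True, batch_first=True)
        self.classifier = nn.Linear(2 * d_model, num_classes)

    def forward(self, joints, rgb_feats):
        # joints: (B, T, J, 3) 3D body-joint positions
        # rgb_feats: (B, T, d_model) CNN features of the RGB frames
        B, T, J, C = joints.shape
        spatial = self.spatial(joints.reshape(B, T, J * C))                            # (B, T, d)
        temporal = self.temporal(joints.permute(0, 2, 1, 3).reshape(B, J, T * C))      # (B, J, d)
        # Fusion: spatial (per-frame) tokens attend to temporal (per-joint) tokens.
        fused, _ = self.fuse_attn(spatial, temporal, temporal)                         # (B, T, d)
        pose_vec = fused.mean(dim=1)                                                   # (B, d)

        rgb, _ = self.rgb_attn(rgb_feats, rgb_feats, rgb_feats)                        # (B, T, d)
        rgb, _ = self.rgb_lstm(rgb)                                                    # (B, T, d)
        rgb_vec = rgb.mean(dim=1)                                                      # (B, d)

        # Combine the fused pose stream with the RGB stream for classification.
        return self.classifier(torch.cat([pose_vec, rgb_vec], dim=-1))

model = TwoStreamPoseFusion()
logits = model(torch.randn(2, 64, 25, 3), torch.randn(2, 64, 128))
print(logits.shape)  # torch.Size([2, 60])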