Workout Action Recognition in Video Streams Using an Attention Driven Residual DC-GRU Network

Cited by: 5
Authors
Dey, Arnab [1 ]
Biswas, Samit [1 ]
Le, Dac-Nhuong [2 ]
Affiliations
[1] Indian Inst Engn Sci & Technol, Dept Comp Sci & Technol, Sibpur 711103, Howrah, India
[2] Haiphong Univ, Fac Informat Technol, Haiphong 180000, Vietnam
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2024, Vol. 79, No. 2
Keywords
Workout action recognition; video stream; action recognition; residual network; GRU; attention; SPATIOTEMPORAL FEATURES; LSTM;
DOI
10.32604/cmc.2024.049512
CLC Classification Number
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Regular exercise is a crucial aspect of daily life, as it keeps individuals physically active, lowers the likelihood of developing illness, and enhances life expectancy. Recognizing workout actions in video streams is an important problem in computer vision research, as it supports exercise adherence, enables instant recognition, advances fitness-tracking technologies, and helps optimize fitness routines. However, existing action datasets often lack diversity and specificity for workout actions, hindering the development of accurate recognition models. To address this gap, the Workout Action Video dataset (WAVd) is introduced as a significant contribution. WAVd comprises a diverse collection of labeled workout action videos, curated to cover various exercises performed by numerous individuals in different settings. This research proposes an innovative framework based on an Attention-driven Residual Deep Convolutional-Gated Recurrent Unit (ResDC-GRU) network for workout action recognition in video streams. Unlike image-based action recognition, videos contain spatio-temporal information, making the task more complex and challenging. While substantial progress has been made in this area, challenges persist in detecting subtle and complex actions, handling occlusions, and managing the computational demands of deep learning approaches. The proposed ResDC-GRU Attention model demonstrated exceptional classification performance with 95.81% accuracy on workout action videos and outperformed various state-of-the-art models. The method further yielded 81.6%, 97.2%, 95.6%, and 93.2% accuracy on the established benchmark datasets HMDB51, YouTube Actions, UCF50, and UCF101, respectively, showcasing its superiority and robustness in action recognition. The findings suggest practical implications for real-world scenarios where precise video action recognition is paramount, addressing persistent challenges in the field. The WAVd dataset serves as a catalyst for the development of more robust and effective fitness-tracking systems, ultimately promoting healthier lifestyles through improved exercise monitoring and analysis.
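The abstract names the overall pipeline (a residual deep-convolutional frame encoder, a GRU over time, and an attention mechanism) but not its exact configuration. The following PyTorch code is a minimal sketch of that general architecture family, assuming a small residual frame encoder, a single GRU layer, and soft temporal-attention pooling; the class names, layer sizes, and attention form are illustrative assumptions, not the authors' published model.

# Illustrative sketch only: residual CNN per frame -> GRU across frames ->
# attention pooling -> classifier. Sizes and the attention form are assumptions.
import torch
import torch.nn as nn

class ResidualConvBlock(nn.Module):
    """Two 3x3 convolutions with a skip connection, applied to a single frame."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.skip = nn.Conv2d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + self.skip(x))

class ResDCGRUAttention(nn.Module):
    """Residual conv encoder per frame, GRU over time, soft attention pooling."""
    def __init__(self, num_classes, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(
            ResidualConvBlock(3, 32), nn.MaxPool2d(2),
            ResidualConvBlock(32, 64), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),               # (B*T, 64, 1, 1)
        )
        self.gru = nn.GRU(64, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)           # scalar relevance score per frame
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, clip):                       # clip: (B, T, 3, H, W)
        b, t, c, h, w = clip.shape
        feats = self.encoder(clip.reshape(b * t, c, h, w)).reshape(b, t, 64)
        states, _ = self.gru(feats)                # (B, T, hidden)
        weights = torch.softmax(self.attn(states), dim=1)   # attention over time
        pooled = (weights * states).sum(dim=1)     # weighted temporal pooling
        return self.head(pooled)                   # class logits

if __name__ == "__main__":
    model = ResDCGRUAttention(num_classes=20)
    dummy = torch.randn(2, 16, 3, 112, 112)        # 2 clips of 16 RGB frames
    print(model(dummy).shape)                      # torch.Size([2, 20])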
Pages: 3067-3087
Page count: 21