Workout Action Recognition in Video Streams Using an Attention Driven Residual DC-GRU Network

Cited by: 5
Authors
Dey, Arnab [1]
Biswas, Samit [1]
Le, Dac-Nhuong [2]
Affiliations
[1] Indian Inst Engn Sci & Technol, Dept Comp Sci & Technol, Sibpur 711103, Howrah, India
[2] Haiphong Univ, Fac Informat Technol, Haiphong 180000, Vietnam
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2024 / Vol. 79 / No. 02
Keywords
Workout action recognition; video stream; action recognition; residual network; GRU; attention; SPATIOTEMPORAL FEATURES; LSTM;
DOI
10.32604/cmc.2024.049512
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Regular exercise is a crucial aspect of daily life, as it enables individuals to stay physically active, lowers the likelihood of developing illnesses, and enhances life expectancy. The recognition of workout actions in video streams holds significant importance in computer vision research, as it aims to enhance exercise adherence, enable instant recognition, advance fitness tracking technologies, and optimize fitness routines. However, existing action datasets often lack diversity and specificity for workout actions, hindering the development of accurate recognition models. To address this gap, the Workout Action Video dataset (WAVd) has been introduced as a significant contribution. WAVd comprises a diverse collection of labeled workout action videos, meticulously curated to encompass various exercises performed by numerous individuals in different settings. This research proposes an innovative framework based on the Attention driven Residual Deep Convolutional-Gated Recurrent Unit (ResDC-GRU) network for workout action recognition in video streams. Unlike image-based action recognition, videos contain spatio-temporal information, making the task more complex and challenging. While substantial progress has been made in this area, challenges persist in detecting subtle and complex actions, handling occlusions, and managing the computational demands of deep learning approaches. The proposed ResDC-GRU Attention model demonstrated exceptional classification performance with 95.81% accuracy in classifying workout action videos and also outperformed various state-of-the-art models. The method also yielded 81.6%, 97.2%, 95.6%, and 93.2% accuracy on established benchmark datasets, namely HMDB51, YouTube Actions, UCF50, and UCF101, respectively, showcasing its superiority and robustness in action recognition.
The findings suggest practical implications in real-world scenarios where precise video action recognition is paramount, addressing the persisting challenges in the field. The WAVd dataset serves as a catalyst for the development of more robust and effective fitness tracking systems and ultimately promotes healthier lifestyles through improved exercise monitoring and analysis.
Pages: 3067 - 3087
Page count: 21
Related papers
50 in total
  • [11] Multipath Attention and Adaptive Gating Network for Video Action Recognition
    Zhang, Haiping
    Hu, Zepeng
    Yu, Dongjin
    Guan, Liming
    Liu, Xu
    Ma, Conghao
    NEURAL PROCESSING LETTERS, 2024, 56 (02)
  • [12] Emotion Recognition in Video Streams Using Intramodal and Intermodal Attention Mechanisms
    Mocanu, Bogdan
    Tapu, Ruxandra
    ADVANCES IN VISUAL COMPUTING, ISVC 2022, PT II, 2022, 13599 : 295 - 306
  • [13] Video-based action recognition using spurious-3D residual attention networks
    Chen, Bo
    Tang, Hongying
    Zhang, Zebin
    Tong, Guanjun
    Li, Baoqing
    IET IMAGE PROCESSING, 2022, 16 (11) : 3097 - 3111
  • [14] An attention mechanism based convolutional LSTM network for video action recognition
    Ge, Hongwei
    Yan, Zehang
    Yu, Wenhao
    Sun, Liang
    MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (14) : 20533 - 20556
  • [15] CHANNEL-WISE TEMPORAL ATTENTION NETWORK FOR VIDEO ACTION RECOGNITION
    Lei, Jianjun
    Jia, Yalong
    Peng, Bo
    Huang, Qingming
    2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2019, : 562 - 567
  • [17] CANet: Comprehensive Attention Network for video-based action recognition
    Gao, Xiong
    Chang, Zhaobin
    Ran, Xingcheng
    Lu, Yonggang
    KNOWLEDGE-BASED SYSTEMS, 2024, 296
  • [18] Human Action Representation Learning Using an Attention-Driven Residual 3DCNN Network
    Ullah, Hayat
    Munir, Arslan
    ALGORITHMS, 2023, 16 (08)
  • [19] Human Action Recognition in Unconstrained Trimmed Videos Using Residual Attention Network and Joints Path Signature
    Ahmad, Tasweer
    Jin, Lianwen
    Feng, Jialuo
    Tang, Guozhi
    IEEE ACCESS, 2019, 7 : 121212 - 121222
  • [20] Separable 3D residual attention network for human action recognition
    Zhang, Zufan
    Peng, Yue
    Gan, Chenquan
    Abate, Andrea Francesco
    Zhu, Lianxiang
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (04) : 5435 - 5453