Human action recognition in immersive virtual reality based on multi-scale spatio-temporal attention network

Cited: 0
Authors
Xiao, Zhiyong [1 ]
Chen, Yukun [1 ]
Zhou, Xinlei [1 ]
He, Mingwei [2 ]
Liu, Li [1 ]
Yu, Feng [1 ,2 ,3 ]
Jiang, Minghua [1 ,3 ]
Affiliations
[1] Wuhan Text Univ, Sch Comp Sci & Artificial Intelligence, Wuhan, Peoples R China
[2] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore, Singapore
[3] Engn Res Ctr Hubei Prov Clothing Informat, Wuhan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
human activity recognition; multi-scale feature; spatio-temporal feature; virtual reality; SIMULATION; SENSORS;
DOI
Not available
CLC classification
TP31 [Computer software];
Discipline codes
081202; 0835;
Abstract
Wearable human action recognition (HAR) has practical applications in daily life. However, traditional HAR methods focus solely on identifying user movements and lack interactivity and user engagement. This paper proposes a novel immersive HAR method called MovPosVR. Virtual reality (VR) technology is employed to create realistic scenes and enhance the user experience. To improve the accuracy of user action recognition in immersive HAR, a multi-scale spatio-temporal attention network (MSSTANet) is proposed. The network combines a convolutional residual squeeze-and-excitation (CRSE) module with a multi-branch convolution and long short-term memory (MCLSTM) module to extract spatio-temporal features and automatically select relevant features from action signals. Additionally, a multi-head attention with shared linear mechanism (MHASLM) module is designed to facilitate information interaction, further enhancing feature extraction and improving accuracy. MSSTANet achieves superior performance, with accuracy rates of 99.33% and 98.83% on the publicly available WISDM and PAMAP2 datasets, respectively, surpassing state-of-the-art networks. Our method showcases the potential to display user actions and position information in a virtual world, enriching user experiences and interactions across diverse application scenarios.
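To make the described architecture concrete, the sketch below gives a minimal PyTorch layout of the three modules named in the abstract (CRSE, MCLSTM, MHASLM) applied to windowed wearable-sensor signals. It is not the authors' implementation: the channel widths, kernel sizes, number of attention heads, the tied query/key/value projection used to approximate the "shared linear" mechanism, and the final pooling and classifier head are all assumptions made purely for illustration.

# Illustrative sketch only (assumed layer sizes), not the published MSSTANet code.
import torch
import torch.nn as nn


class CRSE(nn.Module):
    """Convolutional residual squeeze-and-excitation block (structure assumed)."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm1d(channels),
            nn.ReLU(),
        )
        # Squeeze-and-excitation: global pooling followed by a two-layer channel gate.
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool1d(1),
            nn.Conv1d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(),
            nn.Conv1d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):                 # x: (batch, channels, time)
        y = self.conv(x)
        y = y * self.se(y)                 # channel-wise re-weighting
        return x + y                       # residual connection


class MCLSTM(nn.Module):
    """Multi-branch convolutions (different kernel sizes) followed by an LSTM."""
    def __init__(self, channels, hidden=64, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv1d(channels, channels, k, padding=k // 2) for k in kernel_sizes]
        )
        self.lstm = nn.LSTM(channels * len(kernel_sizes), hidden, batch_first=True)

    def forward(self, x):                  # x: (batch, channels, time)
        y = torch.cat([branch(x) for branch in self.branches], dim=1)
        y, _ = self.lstm(y.transpose(1, 2))  # -> (batch, time, hidden)
        return y


class MHASLM(nn.Module):
    """Multi-head attention; the 'shared linear' idea is approximated here by
    tying the query/key/value projections to a single linear layer (assumption)."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.shared = nn.Linear(dim, dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                  # x: (batch, time, dim)
        q = k = v = self.shared(x)
        y, _ = self.attn(q, k, v)
        return x + y


class MSSTANet(nn.Module):
    """Sensor windows (batch, sensor_channels, time) -> action class logits."""
    def __init__(self, in_channels=6, num_classes=6, hidden=64):
        super().__init__()
        self.stem = nn.Conv1d(in_channels, 32, kernel_size=3, padding=1)
        self.crse = CRSE(32)
        self.mclstm = MCLSTM(32, hidden=hidden)
        self.mhaslm = MHASLM(hidden)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x):
        y = self.stem(x)
        y = self.crse(y)
        y = self.mclstm(y)
        y = self.mhaslm(y)
        return self.head(y.mean(dim=1))    # temporal average pooling before the classifier


if __name__ == "__main__":
    # Example: a batch of 8 windows, 6 inertial channels, 128 time steps (WISDM-like shape).
    logits = MSSTANet()(torch.randn(8, 6, 128))
    print(logits.shape)                    # torch.Size([8, 6])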
Pages: 15
Related papers
50 records total
  • [21] MPANet: Multi-scale Pyramid Attention Network for Collaborative Modeling Spatio-Temporal Patterns of Default Mode Network
    Yuan, Hang
    Li, Xiang
    Wei, Benzheng
    ADVANCES IN ARTIFICIAL INTELLIGENCE, AI 2023, PT I, 2024, 14471 : 416 - 425
  • [22] A Spatio-Temporal Multi-Scale Binary Descriptor
    Xompero, Alessio
    Lanz, Oswald
    Cavallaro, Andrea
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 (29) : 4362 - 4375
  • [23] Efficient spatio-temporal network for action recognition
    Su, Yanxiong
    Zhao, Qian
    JOURNAL OF REAL-TIME IMAGE PROCESSING, 2024, 21 (05)
  • [24] Multi-Scale Spatio-Temporal Memory Network for Lightweight Video Denoising
    Sun, Lu
    Wu, Fangfang
    Ding, Wei
    Li, Xin
    Lin, Jie
    Dong, Weisheng
    Shi, Guangming
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 5810 - 5823
  • [25] Spatio-temporal segments attention for skeleton-based action recognition
    Qiu, Helei
    Hou, Biao
    Ren, Bo
    Zhang, Xiaohua
    NEUROCOMPUTING, 2023, 518 : 30 - 38
  • [26] A novel approach based on spatio-temporal attention and multi-scale modeling for mechanical failure prediction
    Zhai, Weimin
    Fu, Weiming
    Qin, Jiahu
    Ma, Qichao
    Kang, Yu
    CONTROL ENGINEERING PRACTICE, 2024, 147
  • [27] MSSTN: a multi-scale spatio-temporal network for traffic flow prediction
    Song, Yun
    Bai, Xinke
    Fan, Wendong
    Deng, Zelin
    Jiang, Cong
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024, 15 (07) : 2827 - 2841
  • [28] Interpretable Spatio-temporal Attention for Video Action Recognition
    Meng, Lili
    Zhao, Bo
    Chang, Bo
    Huang, Gao
    Sun, Wei
    Tung, Frederich
    Sigal, Leonid
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 1513 - 1522
  • [29] Resstanet: deep residual spatio-temporal attention network for violent action recognition
    Pandey, A.
    Kumar, P.
    INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY, 2024, 16 (5) : 2891 - 2900
  • [30] Spatio-Temporal Attention Networks for Action Recognition and Detection
    Li, Jun
    Liu, Xianglong
    Zhang, Wenxuan
    Zhang, Mingyuan
    Song, Jingkuan
    Sebe, Nicu
    IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (11) : 2990 - 3001