Continuous Sign Language Recognition Based on Spatial-Temporal Graph Attention Network

Cited by: 4
Authors
Guo, Qi [1]
Zhang, Shujun [1]
Li, Hui [1]
Affiliations
[1] Qingdao Univ Sci & Technol, Coll Informat Sci & Technol, Qingdao 266061, Peoples R China
Source
Keywords
Continuous sign language recognition; graph attention network; bidirectional long short-term memory; connectionist temporal classification
DOI
10.32604/cmes.2022.021784
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Continuous sign language recognition (CSLR) is challenging because of complex video backgrounds, variability in hand gestures, and the difficulty of temporal modeling. This work proposes a CSLR method based on a spatial-temporal graph attention network that focuses on the essential features of video sequences. The method captures local details of sign language movements by taking joint and bone information as input and constructing a spatial-temporal graph that reflects both inter-frame relevance and the physical connections between nodes. A graph-based multi-head attention mechanism, computed together with the adjacency matrix, is used for better local-feature exploration, and a temporal convolutional network models short-term motion correlation. A bidirectional long short-term memory (BLSTM) network learns long-term dependencies, and connectionist temporal classification aligns the word-level sequences. The proposed method achieves competitive results: a word error rate of 1.59% on the Chinese Sign Language dataset and a mean Jaccard Index of 65.78% on the ChaLearn LAP Continuous Gesture Dataset.
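For readers unfamiliar with the word error rate quoted in the abstract, the following is a minimal sketch of how the metric is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the sample gloss sequences are hypothetical; the evaluation protocol the authors actually used is not given in this record.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length.

    Both arguments are lists of words (glosses). This follows the
    conventional WER definition and is an assumption, not the paper's
    own evaluation script.
    """
    if not reference:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hypothesis, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(reference)


# Hypothetical gloss sequences: one substitution and one deletion
# against a four-word reference.
reference = ["I", "GO", "SCHOOL", "TODAY"]
hypothesis = ["I", "WALK", "SCHOOL"]
wer = word_error_rate(reference, hypothesis)
```

Because insertions count as errors, this ratio can exceed 1 when the hypothesis is much longer than the reference.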
Pages: 1653-1670
Page count: 18