Multi-Stream General and Graph-Based Deep Neural Networks for Skeleton-Based Sign Language Recognition

Cited by: 10
Authors
Miah, Abu Saleh Musa [1]
Hasan, Md. Al Mehedi [2]
Jang, Si-Woong [3]
Lee, Hyoun-Sup [4]
Shin, Jungpil [1]
Affiliations
[1] Univ Aizu, Sch Comp Sci & Engn, Aizu Wakamatsu 9658580, Japan
[2] Rajshahi Univ Engn & Technol RUET, Dept Comp Sci & Engn, Rajshahi 6204, Bangladesh
[3] Dong Eui Univ, Dept Comp Engn, Busan 47340, South Korea
[4] Dong Eui Univ, Dept Appl Software Engn, Busan 47340, South Korea
Keywords
sign language recognition (SLR); large scale dataset; American Sign Language; Turkish Sign Language; Chinese Sign Language; AUTSL; CSL;
DOI
10.3390/electronics12132841
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Sign language recognition (SLR) aims to bridge speech-impaired and general communities by recognizing signs from videos. However, complex backgrounds, varying illumination, and differences between subjects in the videos still make it challenging to develop effective SLR systems. Many researchers have recently turned to skeleton-based SLR to overcome subject and background variation in hand gesture sign videos. However, skeleton-based SLR remains underexplored, mainly due to a lack of information and hand key point annotations. More recently, researchers have included body and face information along with hand gesture information for SLR; however, the resulting accuracy and generalizability remain unsatisfactory. In this paper, we propose a multi-stream graph-based deep neural network (SL-GDN) for skeleton-based SLR to overcome these problems. The main purpose of SL-GDN is to improve the generalizability and accuracy of the SLR system while keeping the computational cost low, using human body pose in the form of 2D landmark locations. We first construct a skeleton graph from 27 whole-body key points selected from 67 key points, which addresses the high computational cost. Then, we use the multi-stream SL-GDN to extract features from the whole-body skeleton graph in four streams. Finally, we concatenate the four sets of features and apply a classification module to refine them and recognize the corresponding sign classes. Our data-driven graph construction method increases the system's flexibility and generalizability, allowing it to adapt to varied data. We evaluate the proposed model on two large-scale benchmark SLR data sets: the Turkish Sign Language data set (AUTSL) and the Chinese Sign Language data set (CSL). The reported accuracy results demonstrate the strong performance of the proposed model, and we believe that it will be considered a significant innovation in the SLR domain.
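The abstract does not say what the four streams are. As a rough illustration only, the sketch below assumes the decomposition common in skeleton-based recognition (joint, bone, joint motion, bone motion) and replaces the graph network with a placeholder temporal mean. The function names, the chain-shaped `parents` array, and the pooling step are all hypothetical and are not taken from the paper.

```python
import numpy as np

def four_streams(joints, parents):
    """Split a 2D landmark sequence into four streams (assumed, not from the paper).

    joints:  array of shape (T, V, 2), T frames of V key points in 2D.
    parents: length-V array with the parent index of each key point
             (a hypothetical skeleton; the root points to itself).
    """
    joints = np.asarray(joints, dtype=np.float64)
    parents = np.asarray(parents)
    # Bone stream: vector from each key point's parent to the key point.
    bones = joints - joints[:, parents, :]
    # Motion streams: frame-to-frame differences, zero for the first frame.
    joint_motion = np.zeros_like(joints)
    joint_motion[1:] = joints[1:] - joints[:-1]
    bone_motion = np.zeros_like(bones)
    bone_motion[1:] = bones[1:] - bones[:-1]
    return joints, bones, joint_motion, bone_motion

def fuse(streams):
    """Placeholder for per-stream feature extraction followed by concatenation.

    A real model would run a graph network per stream; here each stream is
    simply averaged over time and flattened before concatenation.
    """
    return np.concatenate([s.mean(axis=0).ravel() for s in streams])

rng = np.random.default_rng(0)
T, V = 16, 27                               # 27 selected whole-body key points
joints = rng.normal(size=(T, V, 2))         # synthetic data, not a real sign
parents = np.maximum(np.arange(V) - 1, 0)   # hypothetical chain skeleton
streams = four_streams(joints, parents)
fused = fuse(streams)                       # would feed a classification module
```

In this sketch the fused vector has 4 × 27 × 2 entries, one block per stream. Restricting the graph to 27 key points keeps every per-stream array small, which is in the spirit of the low computational cost the abstract emphasizes.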
Pages: 15