Video-Based Sign Language Recognition without Temporal Segmentation

Authors
Huang, Jie [1]
Zhou, Wengang [1]
Zhang, Qilin [2]
Li, Houqiang [1]
Li, Weiping [1]
Affiliations
[1] Univ Sci & Technol China, Dept Elect Engn & Informat Sci, Hefei, Anhui, Peoples R China
[2] HERE Technol, Chicago, IL USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Millions of hearing-impaired people around the world routinely use some variant of sign language to communicate, so the automatic translation of sign language is meaningful and important. Currently, Sign Language Recognition (SLR) comprises two sub-problems: isolated SLR, which recognizes signs word by word, and continuous SLR, which translates entire sentences. Existing continuous SLR methods typically use isolated SLR as a building block, with an extra layer of preprocessing (temporal segmentation) and another layer of post-processing (sentence synthesis). Unfortunately, temporal segmentation is itself non-trivial and inevitably propagates errors into subsequent steps. Worse still, isolated SLR methods typically require laborious labeling of each word in a sentence separately, severely limiting the amount of attainable training data. To address these challenges, we propose a novel continuous sign recognition framework, the Hierarchical Attention Network with Latent Space (LS-HAN), which eliminates the temporal segmentation preprocessing step. The proposed LS-HAN consists of three components: a two-stream Convolutional Neural Network (CNN) for generating video feature representations, a Latent Space (LS) for bridging the semantic gap, and a Hierarchical Attention Network (HAN) for latent-space-based recognition. Experiments on two large-scale datasets demonstrate the effectiveness of the proposed framework.
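Illustrative note: the abstract names a hierarchical attention network but gives no implementation details. The following is a minimal sketch of generic hierarchical attention pooling, not the authors' LS-HAN. It assumes a two-level scheme (frames pooled within clips, then clips pooled within the video); the function names, the dot-product scoring, and the toy feature vectors are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_pool(vectors, query):
    """Return the attention-weighted average of `vectors` and the weights.

    Weights are the softmax of dot(vector, query). The dot-product score is
    an assumption made for this sketch, not a detail taken from the paper.
    """
    weights = softmax([dot(v, query) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights


def hierarchical_attention(clips, frame_query, clip_query):
    """Pool frames into clip vectors, then clip vectors into one video vector.

    `clips` is a list of clips; each clip is a list of frame feature vectors.
    """
    clip_vectors = [attention_pool(frames, frame_query)[0] for frames in clips]
    return attention_pool(clip_vectors, clip_query)


# Toy input: two clips of 2-D frame features (hypothetical values).
video = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 2.0], [2.0, 2.0], [2.0, 2.0]],
]
# Zero queries give uniform attention, so each level reduces to a plain mean.
video_vector, clip_weights = hierarchical_attention(video, [0.0, 0.0], [0.0, 0.0])
```

With learned (non-zero) query vectors, the same two-level pooling would weight informative frames and clips more heavily, which is the general idea behind attending over a video without first segmenting it into isolated words.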
Pages: 2257-2264
Number of pages: 8