Multi-Stroke Thai Finger-Spelling Sign Language Recognition System with Deep Learning

Cited by: 8
Authors
Pariwat, Thongpan [1]
Seresangtakul, Pusadee [1]
Affiliations
[1] Khon Kaen Univ, Nat Language & Speech Proc Lab, Fac Sci, Dept Comp Sci, Khon Kaen 40002, Thailand
Source
SYMMETRY-BASEL | 2021 / Vol. 13 / Issue 02
Keywords
TFSL recognition system; deep learning; semantic segmentation; optical flow; complex background
DOI
10.3390/sym13020262
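Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract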
Sign language is a language used by the hearing impaired that the general public commonly does not understand. A sign language recognition system can therefore serve as an intermediary between the two sides. In this study, a multi-stroke Thai finger-spelling sign language (TFSL) recognition system based on deep learning was developed as a communication tool. The system uses a vision-based technique against a complex background: semantic segmentation with dilated convolution segments the hand, optical flow separates the hand strokes, and a convolutional neural network (CNN) performs feature learning and classification. We compared five CNN structures, referred to as formats. The first format used 64 filters of size 3 x 3 with 7 layers; the second used 128 filters of size 3 x 3 with 7 layers; the third used an ascending number of filters across 7 layers, all with the same 3 x 3 filter size; the fourth used an ascending number of filters with filter sizes chosen to be small, again with 7 layers; the fifth was a structure based on AlexNet. The average accuracies were 88.83%, 87.97%, 89.91%, 90.43%, and 92.03%, respectively. We implemented the AlexNet-based CNN structure to create the models for the multi-stroke TFSL recognition system. The experiment was performed on isolated videos of 42 Thai letters, divided into three categories: one stroke, two strokes, and three strokes. The results showed an average accuracy of 88.00% for one stroke, 85.42% for two strokes, and 75.00% for three strokes.
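To make the difference between the first two formats concrete, the following is a minimal sketch that counts the trainable parameters in a stack of seven 3 x 3 convolutional layers. It assumes a 3-channel RGB input and one bias per filter, and it counts only the convolutional layers; the abstract does not state the input shape, pooling, or classifier layers, so those are left out. The function names `conv_params` and `stack_params` are hypothetical helpers, not part of the published system.

```python
def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus biases of one 2-D convolutional layer."""
    return kernel * kernel * in_channels * out_channels + out_channels


def stack_params(filters_per_layer, in_channels=3, kernel=3):
    """Total parameters of consecutive conv layers (assumed RGB input)."""
    total = 0
    for out_channels in filters_per_layer:
        total += conv_params(in_channels, out_channels, kernel)
        # The output channels of this layer feed the next layer.
        in_channels = out_channels
    return total


# Format 1: 64 filters per layer; format 2: 128 filters per layer.
format_1 = stack_params([64] * 7)
format_2 = stack_params([128] * 7)
print(format_1, format_2)
```

Under these assumptions the second format carries roughly four times as many convolutional parameters as the first, while the abstract reports a slightly lower average accuracy for it (87.97% versus 88.83%).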
Pages: 1 / 19
Page count: 19
Related papers
50 in total
  • [1] Thai Finger-Spelling Sign Language Recognition Using Global and Local Features with SVM
    Pariwat, Thongpan
    Seresangtakul, Pusadee
    [J]. 2017 9TH INTERNATIONAL CONFERENCE ON KNOWLEDGE AND SMART TECHNOLOGY (KST), 2017, : 116 - 120
  • [2] Explicit quaternion krawtchouk moment invariants for finger-spelling sign language recognition
    Elouariachi, Ilham
    Benouini, Rachid
    Zenkouar, Khalid
    Zarghili, Arsalane
    El Fadili, Hakim
    [J]. 28TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2020), 2021, : 620 - 624
  • [3] American Sign Language finger spelling recognition system
    Allen, JM
    Asselin, PK
    Foulds, R
    [J]. PROCEEDINGS OF THE IEEE 29TH ANNUAL NORTHEAST BIOENGINEERING CONFERENCE, 2003, : 285 - 286
  • [4] Deep multimodal-based finger spelling recognition for Thai sign language: a new benchmark and model composition
    Vijitkunsawat, Wuttichai
    Racharak, Teeradaj
    Le Nguyen, Minh
    [J]. MACHINE VISION AND APPLICATIONS, 2024, 35 (04)
  • [5] Finger Spelling Recognition for Nepali Sign Language
    Thapa, Vivek
    Sunuwar, Jhuma
    Pradhan, Ratika
    [J]. RECENT DEVELOPMENTS IN MACHINE LEARNING AND DATA ANALYTICS, 2019, 740 : 219 - 227
  • [6] An Ego-camera Based Finger-spelling Recognition System
    Tan, Joo Kooi
    Hamada, Satoshi
    Hirakawa, Manabu
    Kim, Hyoungseop
    Ishikawa, Seiji
    [J]. PROCEEDINGS OF THE 2016 IEEE REGION 10 CONFERENCE (TENCON), 2016, : 358 - 363
  • [7] American Sign Language-Based Finger-spelling Recognition using k-Nearest Neighbors Classifier
    Aryanie, Dewinta
    Heryadi, Yaya
    [J]. 2015 3RD INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY (ICOICT), 2015, : 533 - 536
  • [8] User-independent system for sign language finger spelling recognition
    Dahmani, Djamila
    Larabi, Slimane
    [J]. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2014, 25 (05) : 1240 - 1250
  • [9] Thai Finger-Spelling Recognition Using a Cascaded Classifier Based on Histogram of Orientation Gradient Features
    Silanon, Kittasil
    [J]. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2017, 2017
  • [10] Finger-Spelling Recognition System using Fuzzy Finger Shape and Hand Appearance Features
    Silanon, Kittasil
    Suvonvorn, Nikom
    [J]. 2014 FOURTH INTERNATIONAL CONFERENCE ON DIGITAL INFORMATION AND COMMUNICATION TECHNOLOGY AND IT'S APPLICATIONS (DICTAP), 2014, : 419 - 424