Advancing human-computer interaction: AI-driven translation of American Sign Language to Nepali using convolutional neural networks and text-to-speech conversion application

Cited by: 0
Authors
Paneru, Biplov [1 ]
Paneru, Bishwash [2 ]
Poudyal, Khem Narayan [2 ]
Affiliations
[1] Pokhara Univ, Nepal Engn Coll, Dept Elect & Commun Engn, Bhaktapur, Nepal
[2] Tribhuvan Univ, Inst Engn, Dept Appl Sci & Chem Engn, Pulchowk Campus, Lalitpur, Nepal
Keywords
Human-computer interaction; gTTS; American Sign Language; speech-to-text; OpenCV; Tkinter; VGG16; convolutional neural networks
DOI
10.1016/j.sasc.2024.200165
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Advanced technology that serves people with impairments is severely lacking in Nepal, especially when it comes to helping the hearing impaired communicate. Although sign language is one of the oldest and most organic ways to communicate, there aren't many resources available in Nepal to help with the communication gap between Nepali and American Sign Language (ASL). This study investigates the application of Convolutional Neural Networks (CNN) and AI-driven methods for translating ASL into Nepali text and speech to bridge the technical divide. Two pre-trained transfer learning models, ResNet50 and VGG16, were refined to classify ASL signs using extensive ASL image datasets. The system utilizes the Python gTTS package to translate signs into Nepali text and speech, integrating with an OpenCV video input TKinter-based Graphical User Interface (GUI). With both CNN architectures, the model's accuracy of over 99 % allowed for the smooth conversion of ASL to speech output. By providing a workable solution to improve inclusion and communication, the deployment of an AI-driven translation system represents a significant step in lowering the technological obstacles that disabled people in Nepal must overcome.
Pages: 16