Advancing human-computer interaction: AI-driven translation of American Sign Language to Nepali using convolutional neural networks and text-to-speech conversion application

Cited by: 0
Authors
Paneru, Biplov [1 ]
Paneru, Bishwash [2 ]
Poudyal, Khem Narayan [2 ]
Affiliations
[1] Pokhara Univ, Nepal Engn Coll, Dept Elect & Commun Engn, Bhaktapur, Nepal
[2] Tribhuvan Univ, Inst Engn, Dept Appl Sci & Chem Engn, Pulchowk Campus, Lalitpur, Nepal
Keywords
Human-computer interaction; gTTS; American Sign Language; speech-to-text; OpenCV; Tkinter; VGG16; convolutional neural networks
DOI
10.1016/j.sasc.2024.200165
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Advanced technology that serves people with impairments is severely lacking in Nepal, especially when it comes to helping the hearing impaired communicate. Although sign language is one of the oldest and most organic ways to communicate, there aren't many resources available in Nepal to help with the communication gap between Nepali and American Sign Language (ASL). This study investigates the application of Convolutional Neural Networks (CNN) and AI-driven methods for translating ASL into Nepali text and speech to bridge the technical divide. Two pre-trained transfer learning models, ResNet50 and VGG16, were refined to classify ASL signs using extensive ASL image datasets. The system utilizes the Python gTTS package to translate signs into Nepali text and speech, integrating with an OpenCV video input TKinter-based Graphical User Interface (GUI). With both CNN architectures, the model's accuracy of over 99 % allowed for the smooth conversion of ASL to speech output. By providing a workable solution to improve inclusion and communication, the deployment of an AI-driven translation system represents a significant step in lowering the technological obstacles that disabled people in Nepal must overcome.
Pages: 16