Audio-visual speech recognition techniques in augmented reality environments

被引：15

作者：

Mirzaei, Mohammad Reza ^{[1
]}

Ghorshi, Seyed ^{[1
]}

Mortazavi, Mohammad ^{[1
]}

机构：

[1] Sharif Univ Technol, Sch Sci & Engn, Kish Isl, Iran

来源：

VISUAL COMPUTER | 2014年 / 30卷 / 03期

关键词：

Augmented reality; Audio-visual speech recognition; Augmented reality environments; Communication; Deaf people; VIRTUAL-REALITY;

D O I：

10.1007/s00371-013-0841-1

中图分类号：

TP31 [计算机软件];

学科分类号：

081202 ; 0835 ;

摘要：

Many recent studies show that Augmented Reality (AR) and Automatic Speech Recognition (ASR) technologies can be used to help people with disabilities. Many of these studies have been performed only in their specialized field. Audio-Visual Speech Recognition (AVSR) is one of the advances in ASR technology that combines audio, video, and facial expressions to capture a narrator's voice. In this paper, we combine AR and AVSR technologies to make a new system to help deaf and hard-of-hearing people. Our proposed system can take a narrator's speech instantly and convert it into a readable text and show the text directly on an AR display. Therefore, in this system, deaf people can read the narrator's speech easily. In addition, people do not need to learn sign-language to communicate with deaf people. The evaluation results show that this system has lower word error rate compared to ASR and VSR in different noisy conditions. Furthermore, the results of using AVSR techniques show that the recognition accuracy of the system has been improved in noisy places. Also, the results of a survey that was conducted with 100 deaf people show that more than 80 % of deaf people are very interested in using our system as an assistant in portable devices to communicate with people.

引用

页码：245 / 257

页数：13

共 50 条

[31] Speaker independent audio-visual continuous speech recognition
Liang, LH
Liu, XX
Zhao, YB
Pi, XB
Nefian, AV
[J]. IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOL I AND II, PROCEEDINGS, 2002, : A25 - A28
[32] Audio-Visual Speech Recognition in the Presence of a Competing Speaker
Shao, Xu
Barker, Jon
[J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1292 - 1295
[33] Audio-Visual Automatic Speech Recognition for Connected Digits
Wang, Xiaoping
Hao, Yufeng
Fu, Degang
Yuan, Chunwei
[J]. 2008 INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION TECHNOLOGY APPLICATION, VOL III, PROCEEDINGS, 2008, : 328 - +
[34] DARE: Deceiving Audio-Visual speech Recognition model
Mishra, Saumya
Gupta, Anup Kumar
Gupta, Puneet
[J]. KNOWLEDGE-BASED SYSTEMS, 2021, 232
[35] Multistage information fusion for audio-visual speech recognition
Chu, SM
Libal, V
Marcheret, E
Neti, C
Potamianos, G
[J]. 2004 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXP (ICME), VOLS 1-3, 2004, : 1651 - 1654
[36] DEEP MULTIMODAL LEARNING FOR AUDIO-VISUAL SPEECH RECOGNITION
Mroueh, Youssef
Marcheret, Etienne
Goel, Vaibhava
[J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 2130 - 2134
[37] Relevant feature selection for audio-visual speech recognition
Drugman, Thomas
Gurban, Mihai
Thiran, Jean-Philippe
[J]. 2007 IEEE NINTH WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING, 2007, : 179 - +
[38] Weighting schemes for audio-visual fusion in speech recognition
Glotin, H
Vergyri, D
Neti, C
Potamianos, G
Luettin, J
[J]. 2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001, : 173 - 176
[39] Dynamic Bayesian Networks for Audio-Visual Speech Recognition
Ara V. Nefian
Luhong Liang
Xiaobo Pi
Xiaoxing Liu
Kevin Murphy
[J]. EURASIP Journal on Advances in Signal Processing, 2002
[40] Connectionism based audio-visual speech recognition method
Che, Na
Zhu, Yi-Ming
Zhao, Jian
Sun, Lei
Shi, Li-Juan
Zeng, Xian-Wei
[J]. Jilin Daxue Xuebao (Gongxueban)/Journal of Jilin University (Engineering and Technology Edition), 2024, 54 (10): : 2984 - 2993

← 1 2 3 4 5 →