Speaker-Independent Speech Recognition using Visual Features

Cited by: 0

Authors:

Pooventhiran, G. [1]

Sandeep, A. [1]

Manthiravalli, K. [1]

Harish, D. [1]

Renuka, Karthika D. [1]

Affiliations:

[1] PSG Coll Technol, Dept Informat Technol, Coimbatore 641004, Tamil Nadu, India

Source:

International Journal of Advanced Computer Science and Applications | 2020, Vol. 11, No. 11

Keywords:

Visual speech recognition; audio speech recognition; visemes; lip reading system; Convolutional Neural Network (CNN)

DOI:

10.14569/IJACSA.2020.0111175

Chinese Library Classification:

TP301 [Theory and methods]

Discipline classification code:

081202

Abstract:

Visual Speech Recognition aims at transcribing lip movements into readable text. Automatic speech recognition systems have made many strides and can now recognize words from audio and visual speech features, even under noisy conditions. A robust system uses visual features to support acoustic features, whereas this paper focuses only on the visual features. We propose the concatenation of visemes (lip movements) for text classification rather than the classic mapping of individual visemes. The results show that this approach achieves a significant improvement over state-of-the-art models. The system has two modules: the first extracts lip features from the input video, and the second is a neural network trained to process the viseme sequence and classify it as text.
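The two-module idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the per-frame feature vectors, the `prototypes` table, and the function names are all hypothetical, and a nearest-prototype rule stands in for the neural network the paper describes. The sketch only shows the contrast the abstract draws, namely that the whole concatenated viseme sequence is classified at once instead of each viseme being mapped individually.

```python
from typing import Dict, List, Sequence

# Hypothetical per-frame lip feature vector (stand-in for module 1's output).
Frame = Sequence[float]


def concatenate_visemes(frames: Sequence[Frame]) -> List[float]:
    """Join per-frame lip features, in order, into one sequence-level vector."""
    return [value for frame in frames for value in frame]


def classify(sequence: List[float], prototypes: Dict[str, List[float]]) -> str:
    """Nearest-prototype stand-in for the paper's neural network classifier.

    Assumes every prototype has the same length as the input sequence.
    """
    def squared_distance(a: List[float], b: List[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(prototypes, key=lambda word: squared_distance(sequence, prototypes[word]))


# Toy data, invented for illustration only: three frames, two features each.
prototypes = {
    "yes": [0.1, 0.2, 0.8, 0.9, 0.1, 0.2],
    "no": [0.9, 0.8, 0.2, 0.1, 0.9, 0.8],
}
clip = [(0.2, 0.1), (0.7, 0.9), (0.2, 0.3)]

predicted = classify(concatenate_visemes(clip), prototypes)
print(predicted)
```

Because the classifier sees the full concatenated vector, the order of lip shapes across frames contributes to the decision, which a frame-by-frame viseme lookup would not capture.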


Pages: 616-620

Page count: 5

