Speaker-Independent Speech Recognition using Visual Features

Cited by: 0

Authors:

Pooventhiran, G. [1]

Sandeep, A. [1]

Manthiravalli, K. [1]

Harish, D. [1]

Renuka, Karthika D. [1]

Affiliation:

[1] PSG Coll Technol, Dept Informat Technol, Coimbatore 641004, Tamil Nadu, India

Source:

INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS | 2020, Vol. 11, No. 11

Keywords:

Visual speech recognition; audio speech recognition; visemes; lip reading system; Convolutional Neural Network (CNN)

DOI:

10.14569/IJACSA.2020.0111175

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods]

Discipline classification code:

081202

Abstract:

Visual Speech Recognition aims at transcribing lip movements into readable text. There have been many strides in automatic speech recognition systems that can recognize words with audio and visual speech features, even under noisy conditions. This paper focuses only on the visual features, while a robust system uses visual features to support acoustic features. We propose the concatenation of visemes (lip movements) for text classification rather than a classic individual viseme mapping. The result shows that this approach achieves a significant improvement over the state-of-the-art models. The system has two modules; the first one extracts lip features from the input video, while the next is a neural network system trained to process the viseme sequence and classify it as text.
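The idea of classifying a concatenated viseme sequence, rather than mapping each viseme individually, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the viseme class names, the one-hot encoding, the fixed sequence length, the toy word labels, and above all the nearest-prototype classifier, which merely stands in for the neural network mentioned in the abstract. It is not the authors' implementation.

```python
from typing import Dict, List, Sequence

# Hypothetical viseme classes; this record does not list the paper's viseme set.
VISEME_CLASSES = ["closed", "open", "rounded", "spread"]


def one_hot(viseme: str) -> List[float]:
    """Encode one viseme label as a one-hot vector."""
    return [1.0 if viseme == name else 0.0 for name in VISEME_CLASSES]


def concatenate_visemes(sequence: Sequence[str], max_len: int) -> List[float]:
    """Concatenate per-frame viseme encodings into one fixed-length vector.

    Sequences shorter than max_len are zero-padded; longer ones are truncated.
    """
    vector: List[float] = []
    for viseme in list(sequence)[:max_len]:
        vector.extend(one_hot(viseme))
    vector.extend([0.0] * (max_len * len(VISEME_CLASSES) - len(vector)))
    return vector


def classify(vector: Sequence[float], prototypes: Dict[str, Sequence[float]]) -> str:
    """Stand-in for the neural network: return the nearest word prototype."""
    def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(prototypes, key=lambda word: squared_distance(vector, prototypes[word]))


MAX_LEN = 3  # assumed maximum number of visemes per word
# Toy word prototypes (hypothetical labels and viseme sequences).
prototypes = {
    "bob": concatenate_visemes(["closed", "rounded", "closed"], MAX_LEN),
    "pea": concatenate_visemes(["closed", "spread"], MAX_LEN),
}
predicted = classify(
    concatenate_visemes(["closed", "rounded", "closed"], MAX_LEN), prototypes
)
```

The point of the sketch is only that the classifier sees the whole sequence as a single input, so the order and context of the visemes can influence the predicted text.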


Pages: 616-620

Number of pages: 5
