Boundary Descriptors for Visual Speech Recognition

Cited by: 0

Authors:

Gupta, Deepika [1]

Singh, Preety [1]

Laxmi, V. [1]

Gaur, Manoj S. [1]

Affiliation:

[1] Malaviya National Institute of Technology, Department of Computer Engineering, Jaipur, Rajasthan, India

Source:

COMPUTER AND INFORMATION SCIENCES II | 2012

Keywords:

DOI:

10.1007/978-1-4471-2155-8_39

Chinese Library Classification:

TP301 [Theory and methods];

Discipline classification code:

081202;

Abstract:

Lip reading has attracted considerable research interest as a means of improving the performance of automatic speech recognition (Rabiner, L., Juang, B.: Fundamentals of speech recognition. Prentice Hall, New Jersey (1993)). The key issue in visual speech recognition is how to represent the information from the speech articulators as a feature vector. In this paper, we define the lips using the spatial coordinates of the lip contour as boundary descriptors. Traditionally, Principal Component Analysis (PCA), Discrete Cosine Transform (DCT) and Discrete Fourier Transform (DFT) techniques are applied to pixels from images of the mouth. In our paper, we apply PCA to the spatial points for data reduction. DCT and DFT are applied directly to the boundary descriptors to transform these spatial coordinates into the frequency domain. The new spatial and frequency domain feature vectors are used to classify the spoken word. An accuracy of 53.4% is obtained in the spatial domain and 54.3% in the frequency domain, which is comparable to results reported in the literature.
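
The feature extraction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic elliptical contour, the number of contour points, the number of retained coefficients, and the function names (lip_contour, dft_features, dct_features, pca_reduce) are all assumptions made for illustration. In particular, treating each contour point as the complex number x + jy before the DFT is one common way to build Fourier boundary descriptors; the abstract does not say which formulation the paper uses.

```python
import numpy as np

def lip_contour(n_points=16, width=1.0, height=0.5):
    """Hypothetical stand-in for a tracked lip contour: points on an ellipse."""
    t = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    return np.column_stack([width * np.cos(t), height * np.sin(t)])

def dft_features(contour, n_coeffs=4):
    """Magnitudes of low-frequency DFT coefficients of the boundary x + jy."""
    z = contour[:, 0] + 1j * contour[:, 1]   # assumed complex representation
    spectrum = np.fft.fft(z) / len(z)
    return np.abs(spectrum[1:n_coeffs + 1])  # skip the DC term (centroid)

def dct_features(contour, n_coeffs=4):
    """First n_coeffs unnormalized DCT-II coefficients of the x and y sequences."""
    n = contour.shape[0]
    k = np.arange(n_coeffs)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * (2 * m + 1) * k / (2 * n))  # shape (n_coeffs, n)
    return (basis @ contour).T.ravel()                 # x coefficients, then y

def pca_reduce(samples, n_components=2):
    """Project row vectors onto their leading principal components via SVD."""
    centered = samples - samples.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Ten synthetic "frames" in which only the mouth opening (height) changes.
frames = np.stack([lip_contour(height=h).ravel()
                   for h in np.linspace(0.2, 0.8, 10)])
spatial_features = pca_reduce(frames, n_components=2)
frequency_features = dft_features(lip_contour())
```

In this sketch the spatial-domain features come from PCA over the flattened contour coordinates of each frame, and the frequency-domain features come from transforming a single contour. A classifier for the spoken word would then be trained on either feature set; that step is omitted because the abstract does not name the classifier.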


Pages: 307-313

Page count: 7
