Boundary Descriptors for Visual Speech Recognition

Cited by: 0
Authors
Gupta, Deepika [1]
Singh, Preety [1]
Laxmi, V. [1]
Gaur, Manoj S. [1]
Affiliation
[1] Malaviya Natl Inst Technol, Dept Comp Engn, Jaipur, Rajasthan, India
Source
COMPUTER AND INFORMATION SCIENCES II | 2012
DOI
10.1007/978-1-4471-2155-8_39
Chinese Library Classification
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Lip reading has attracted considerable research interest as a way to improve the performance of automatic speech recognition (Rabiner, L., Juang, B.: Fundamentals of speech recognition. Prentice Hall, New Jersey (1993)). The key issue in visual speech recognition is how to represent the information from the speech articulators as a feature vector. In this paper, we describe the lips using the spatial coordinates of the lip contour as boundary descriptors. Traditionally, Principal Component Analysis (PCA), Discrete Cosine Transform (DCT) and Discrete Fourier Transform (DFT) techniques are applied to pixels from images of the mouth. In this work, we instead apply PCA to the spatial points for data reduction, and apply DCT and DFT directly to the boundary descriptors to transform the spatial coordinates into the frequency domain. The resulting spatial and frequency domain feature vectors are used to classify the spoken word. An accuracy of 53.4% is obtained in the spatial domain and 54.3% in the frequency domain, which is comparable to results reported in the literature.
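
The two feature types named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic elliptical contour, the number of contour points, the number of retained coefficients, and all function names are assumptions chosen for illustration. DCT is omitted for brevity; only the PCA and DFT steps are shown, using NumPy.

```python
import numpy as np

def synthetic_lip_contour(width, height, n_points=16):
    # Hypothetical stand-in for a tracked lip contour: an ellipse
    # sampled at n_points ordered boundary points, shape (n_points, 2).
    t = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    return np.column_stack([width * np.cos(t), height * np.sin(t)])

def pca_reduce(features, n_components):
    # PCA via SVD of the mean-centred data matrix (rows = frames).
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def fourier_descriptors(contour, n_coeffs):
    # Treat each boundary point (x, y) as the complex number x + iy,
    # take the DFT, drop the DC term, and keep the magnitudes of the
    # next n_coeffs coefficients.
    z = contour[:, 0] + 1j * contour[:, 1]
    spectrum = np.fft.fft(z)
    return np.abs(spectrum[1:1 + n_coeffs])

# Four frames with increasing mouth opening (assumed values).
frames = [synthetic_lip_contour(30.0, h) for h in (5.0, 10.0, 15.0, 20.0)]

# Spatial-domain features: flattened coordinates reduced by PCA.
spatial = np.array([c.ravel() for c in frames])
spatial_reduced = pca_reduce(spatial, n_components=2)

# Frequency-domain features: DFT applied directly to the boundary points.
frequency = np.array([fourier_descriptors(c, n_coeffs=4) for c in frames])

print(spatial_reduced.shape, frequency.shape)
```

In this sketch each frame yields one spatial feature vector and one frequency feature vector; either could then be passed to a word classifier, which the abstract does not specify.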
Pages: 307-313
Number of pages: 7