Extraction of Features for Lip-reading Using Autoencoders

Authors
Palecek, Karel [1]
Affiliation
[1] Tech Univ Liberec, Inst Informat Technol & Elect, Liberec 46117, Czech Republic
Source
SPEECH AND COMPUTER | 2014 / Vol. 8773
Keywords
Autoencoder; Hidden Markov Model; Kinect; Lip-reading;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
We study the incorporation of facial depth data in the task of isolated-word visual speech recognition. We propose novel features based on unsupervised training of a single-layer autoencoder. The features are extracted from both the video and depth channels obtained by a Microsoft Kinect device. We perform all experiments on our database of 54 speakers, each uttering 50 words. We compare our autoencoder features to traditional methods such as DCT or PCA. The features are further processed by a simplified variant of hierarchical linear discriminant analysis in order to capture the speech dynamics. The classification is performed using a multi-stream Hidden Markov Model for various combinations of audio, video, and depth channels. We also evaluate visual features in joint audio-video isolated-word recognition in noisy environments.
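The abstract's core idea, learning features without labels by training a single-layer autoencoder, can be illustrated with a minimal sketch. This is not the author's implementation: the sigmoid hidden layer, linear output, plain gradient descent, layer sizes, learning rate, and the random array standing in for flattened video or depth patches are all assumptions chosen for brevity, and the names `train_autoencoder` and `encode` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_autoencoder(X, n_hidden=8, lr=0.1, epochs=200, seed=0):
    """Train a one-hidden-layer autoencoder by full-batch gradient descent.

    X has shape (n_samples, n_inputs). Returns the parameters and the
    mean squared reconstruction error recorded at each epoch.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_inputs = X.shape
    W1 = rng.normal(0.0, 0.1, size=(n_inputs, n_hidden))  # encoder weights
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, size=(n_hidden, n_inputs))  # decoder weights
    b2 = np.zeros(n_inputs)
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)   # hidden activations (the features)
        R = H @ W2 + b2            # linear reconstruction of the input
        E = R - X                  # reconstruction error
        losses.append(float(np.mean(E ** 2)))
        # Backpropagate 0.5 * squared error, averaged over samples.
        dR = E / n_samples
        dH = (dR @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ dR)
        b2 -= lr * dR.sum(axis=0)
        W1 -= lr * (X.T @ dH)
        b1 -= lr * dH.sum(axis=0)
    return (W1, b1, W2, b2), losses


def encode(X, params):
    """Map inputs to hidden-layer features using the trained encoder."""
    W1, b1, _, _ = params
    return sigmoid(X @ W1 + b1)


# Synthetic stand-in for flattened mouth-region patches (assumption).
X = np.random.default_rng(1).random((100, 16))
params, losses = train_autoencoder(X)
features = encode(X, params)
```

In a pipeline like the one the abstract describes, `features` would take the place of DCT or PCA coefficients as the input to the later processing and classification stages, which this sketch does not cover.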
Pages: 209-216
Page count: 8
Related papers
50 in total
  • [1] ROI Processing for Visual Features Extraction in Lip-reading
    Wang, Xiaoping
    Hao, Yufeng
    Fu, Degang
    Yuan, Chunwei
    2008 INTERNATIONAL CONFERENCE ON NEURAL NETWORKS AND SIGNAL PROCESSING, VOLS 1 AND 2, 2007, : 178 - +
  • [2] Automatic lip localization and feature extraction for lip-reading
    Werda, Salah
    Mahdi, Walid
    Ben Hamadou, Abdehnajid
    VISAPP 2007: PROCEEDINGS OF THE SECOND INTERNATIONAL CONFERENCE ON COMPUTER VISION THEORY AND APPLICATIONS, VOLUME IU/MTSV, 2007, : 268 - +
  • [3] LIP-READING
    Lindquist, Ida P.
    VOLTA REVIEW, 1917, 19 (04) : 188 - 188
  • [4] Extraction of frame-difference features based on PCA and ICA for lip-reading
    Lee, KD
    Lee, MJ
    Lee, SY
    Proceedings of the International Joint Conference on Neural Networks (IJCNN), Vols 1-5, 2005, : 232 - 237
  • [5] A Novel Motion Based Lip Feature Extraction for Lip-reading
    Li, Meng
    Cheung, Yiu-ming
    2008 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND SECURITY, VOLS 1 AND 2, PROCEEDINGS, 2008, : 361 - 365
  • [6] LIP-READING
    Naber, Joseph E.
    VOLTA REVIEW, 1920, 22 (08) : 527 - 528
  • [7] LIP-READING
    Wilson, Ida H.
    VOLTA REVIEW, 1920, 22 (04) : 221 - 222
  • [8] LIP-READING
    Unknown
    VOLTA REVIEW, 1919, 21 (12) : 800 - 800
  • [9] LIP-READING
    Wadleigh, Grace K.
    VOLTA REVIEW, 1921, 23 (01) : 46 - 47
  • [10] Lip-Reading using Neural Networks
    Bagai, Abhay
    Gandhi, Harsh
    Goyal, Rahul
    Kohli, Maitrei
    Prasad, T. V.
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2009, 9 (04) : 108 - 111