Visual Speech Recognition Using Optical Flow and Hidden Markov Model

Cited by: 0
Authors
Usha Sharma
Sushila Maheshkar
A. N. Mishra
Rahul Kaushik
Affiliations
[1] Indian Institute of Technology (Indian School of Mines), Department of Computer Science and Engineering
[2] National Institute of Technology, Department of Computer Science and Engineering
[3] Krishna Engineering College, Department of Electronics and Communication Engineering
[4] Jaypee Institute of Information Technology, Department of Electronics and Communication Engineering
Keywords
Automatic speech recognition; Audio-visual speech recognition; Optical flow; Hidden Markov model;
DOI
Not available
Abstract
The present work proposes audio-visual speech recognition using Gammatone frequency cepstral coefficient (GFCC) and optical flow (OF) features on a Hindi speech database. OF refers to the distribution of apparent velocities of brightness-pattern movement in an image. In this technique, OF is determined without extracting the location or contours of each speaker's lips. The horizontal and vertical components of the flow velocities are calculated as visual features. These visual features are then combined with the audio features using an early integration method, followed by classification with a hidden Markov model. Recognition performance on isolated Hindi digits was evaluated using GFCC features in both clean and noisy environments and compared with existing Mel frequency cepstral coefficient (MFCC) features. GFCC gives results almost comparable to MFCC in a clean environment; however, its performance degrades in a noisy environment. Furthermore, when the visual features obtained by OF analysis are combined with the GFCC audio features, recognition performance under noisy conditions improves significantly, by ~12%, ~12%, and ~14% at SNRs of 5 dB, 10 dB, and 20 dB, respectively.
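The early integration step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that early integration means frame-level concatenation of audio and visual feature vectors, and that the slower video stream is linearly interpolated up to the audio frame rate; the abstract does not state how the two streams are aligned, so the interpolation step, the function names, and the toy feature values are all hypothetical.

```python
from typing import List

Frame = List[float]


def interpolate_frames(frames: List[Frame], target_len: int) -> List[Frame]:
    """Linearly interpolate a sequence of feature vectors to target_len frames.

    Assumption: video features arrive at a lower frame rate than audio
    features and are stretched to match before concatenation.
    """
    if not frames:
        raise ValueError("frames must not be empty")
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    n = len(frames)
    if n == 1 or target_len == 1:
        return [list(frames[0]) for _ in range(target_len)]
    out = []
    for i in range(target_len):
        pos = i * (n - 1) / (target_len - 1)  # fractional index into frames
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out


def early_integration(audio: List[Frame], visual: List[Frame]) -> List[Frame]:
    """Concatenate audio and visual features frame by frame (feature-level fusion)."""
    visual_up = interpolate_frames(visual, len(audio))
    return [a + v for a, v in zip(audio, visual_up)]


# Toy data: 5 audio frames with 2 placeholder GFCC-like values each,
# 3 video frames with (horizontal, vertical) flow components each.
audio = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]
visual = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
fused = early_integration(audio, visual)
print(len(fused), len(fused[0]))
```

Each fused vector would then serve as one observation for the hidden Markov model classifier; a real system would use full-length GFCC vectors and flow statistics computed over the mouth region rather than these toy values.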
Pages: 2129-2147
Page count: 18