Visual Speech Recognition Using Optical Flow and Hidden Markov Model

Cited by: 0
Authors
Usha Sharma
Sushila Maheshkar
A. N. Mishra
Rahul Kaushik
Affiliations
[1] Indian Institute of Technology (Indian School of Mines), Department of Computer Science and Engineering
[2] National Institute of Technology, Department of Computer Science and Engineering
[3] Krishna Engineering College, Department of Electronics and Communication Engineering
[4] Jaypee Institute of Information Technology, Department of Electronics and Communication Engineering
Source
Wireless Personal Communications, 2019, 106 (04)
Keywords
Automatic speech recognition; Audio-visual speech recognition; Optical flow; Hidden Markov model
DOI
Not available
Abstract
This work proposes audio-visual speech recognition using Gammatone frequency cepstral coefficient (GFCC) and optical flow (OF) features on a Hindi speech database. OF refers to the distribution of apparent velocities of brightness-pattern movement in an image. In this technique, OF is determined without extracting the location or contours of each speaker's lips. The horizontal and vertical components of the flow velocities are computed as visual features. These visual features are then combined with the audio features using early integration, followed by classification with a hidden Markov model. Recognition performance on isolated Hindi digits was evaluated using GFCC features in both clean and noisy environments and compared with that of existing Mel frequency cepstral coefficient (MFCC) features. GFCC gives results almost comparable to MFCC in a clean environment; however, its performance degrades in noisy environments. Furthermore, combining the visual features obtained by OF analysis with the GFCC audio features yields a significant improvement in recognition performance under noisy conditions of ~12%, ~12%, and ~14% at SNRs of 5 dB, 10 dB, and 20 dB, respectively.
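The early integration step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `flow_features` and `early_integration`, the choice of mean and variance as per-frame flow statistics, and the linear interpolation used to align the video frame rate with the audio frame rate are all assumptions made for illustration. The sketch takes precomputed horizontal (`u`) and vertical (`v`) flow components and concatenates the resulting visual features with audio features frame by frame.

```python
import numpy as np


def flow_features(u, v):
    """Summarize one optical-flow field as a small feature vector.

    u, v: 2-D arrays holding the horizontal and vertical flow components.
    Mean and variance are an illustrative choice, not taken from the paper.
    """
    return np.array([u.mean(), u.var(), v.mean(), v.var()])


def early_integration(audio, visual):
    """Concatenate audio and visual features into one vector per audio frame.

    audio:  array of shape (Ta, Da), e.g. cepstral features per audio frame.
    visual: array of shape (Tv, Dv), e.g. flow features per video frame.
    The visual stream is linearly interpolated to Ta frames (an assumption),
    then stacked next to the audio features, giving shape (Ta, Da + Dv).
    """
    ta, tv = audio.shape[0], visual.shape[0]
    src = np.linspace(0.0, 1.0, tv)   # normalized time axis of the video frames
    dst = np.linspace(0.0, 1.0, ta)   # normalized time axis of the audio frames
    upsampled = np.column_stack(
        [np.interp(dst, src, visual[:, d]) for d in range(visual.shape[1])]
    )
    return np.hstack([audio, upsampled])
```

The fused matrix would then be passed as the observation sequence to a classifier such as a hidden Markov model, one model per digit; that stage is outside this sketch.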
Pages: 2129-2147
Page count: 18
Related papers
50 in total
  • [1] Visual Speech Recognition Using Optical Flow and Hidden Markov Model
    Sharma, Usha
    Maheshkar, Sushila
    Mishra, A. N.
    Kaushik, Rahul
    [J]. WIRELESS PERSONAL COMMUNICATIONS, 2019, 106 (04) : 2129 - 2147
  • [2] Recognition of visual speech elements using Hidden Markov Models
    Foo, SW
    Dong, L
    [J]. ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2002, PROCEEDING, 2002, 2532 : 607 - 614
  • [3] Murmured Speech Recognition Using Hidden Markov Model
    Kumar, Rajesh T.
    Videla, Lakshmi Sarvani
    SivaKumar, Soubraylu
    Asalg, Gopala Gupta
    Haritha, D.
    [J]. 2020 7TH IEEE INTERNATIONAL CONFERENCE ON SMART STRUCTURES AND SYSTEMS (ICSSS 2020), 2020, : 53 - 57
  • [4] Visual speech recognition using motion features and hidden Markov models
    Yau, Wai Chee
    Kumar, Dinesh Kant
    Weghorn, Hans
    [J]. COMPUTER ANALYSIS OF IMAGES AND PATTERNS, PROCEEDINGS, 2007, 4673 : 832 - 839
  • [5] Automatic Urdu Speech Recognition Using Hidden Markov Model
    Asadullah
    Shaukat, Arslan
    Ali, Hazrat
    Akram, Usman
    [J]. 2016 INTERNATIONAL CONFERENCE ON IMAGE, VISION AND COMPUTING (ICIVC 2016), 2016, : 135 - 139
  • [6] Speech recognition of monosyllables using hidden Markov model in VHDL
    Vaidhyanathan, A
    Lakshmiprabha, V
    [J]. TENCON 2004 - 2004 IEEE REGION 10 CONFERENCE, VOLS A-D, PROCEEDINGS: ANALOG AND DIGITAL TECHNIQUES IN ELECTRICAL ENGINEERING, 2004, : A76 - A79
  • [7] Belief Hidden Markov Model for Speech Recognition
    Jendoubi, Siwar
    Ben Yaghlane, Boutheina
    Martin, Arnaud
    [J]. 2013 5TH INTERNATIONAL CONFERENCE ON MODELING, SIMULATION AND APPLIED OPTIMIZATION (ICMSAO), 2013,
  • [8] Visual speech recognition using Active Shape Models and Hidden Markov Models
    Luettin, J
    Thacker, NA
    Beet, SW
    [J]. 1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 817 - 820
  • [9] Recognition of visual speech elements using adaptively boosted hidden Markov models
    Foo, SW
    Lian, Y
    Dong, L
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2004, 14 (05) : 693 - 705
  • [10] Speech Recognition for English to Indonesian Translator Using Hidden Markov Model
    Muhammad, Hariz Zakka
    Nasrun, Muhammad
    Setianingsih, Casi
    Murti, Muhammad Ary
    [J]. 2018 INTERNATIONAL CONFERENCE ON SIGNALS AND SYSTEMS (ICSIGSYS), 2018, : 255 - 260