Lip-reading via a DNN-HMM Hybrid System Using Combination of The Image-based and Model-based Features

Cited by: 0
Authors
Rahmani, Mohammad Hasan [1]
Almasganj, Farshad [1]
Affiliations
[1] Amirkabir Univ Technol, Tehran Polytech, Biomed Engn Dept, Tehran, Iran
Keywords
lip-reading; feature extraction; deep auto-encoder; DBNF; NEURAL-NETWORKS;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Finding features that better represent a speaker's visual information during speech production remains an open problem that strongly affects the quality of lip-reading and Audio-Visual Speech Recognition (AVSR). In this paper, three types of visual features, drawn from both the image-based and the model-based categories, are investigated in a professional lip-reading task. The three feature sets compared are the raw gray-level information of the lip Region of Interest (ROI), the geometric representation of the lip shape, and the Deep Bottleneck Features (DBNFs) extracted from a 6-layer Deep Auto-encoder Neural Network (DANN). Two recognition systems, the conventional GMM-HMM and the state-of-the-art DNN-HMM hybrid, are used to perform an isolated and connected digit recognition task. The results indicate that the high-level information extracted by the deep layers from the lip ROI can represent the visual modality with the advantage of a "high amount of information in a low dimension feature vector". Moreover, on the test data, the DBNFs showed an average relative improvement of 15.4% over the shape features, and the shape features showed an average relative improvement of 20.4% over the ROI features.
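The abstract reports average relative improvements but does not say which metric they are computed on (accuracy or error rate) or how the average is taken. The following is a minimal sketch of one common reading, assuming the metric is recognition accuracy and the average is an unweighted mean over test conditions. The function name `relative_improvement` and all numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean


def relative_improvement(baseline: float, candidate: float) -> float:
    """Percent gain of `candidate` over `baseline`, relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (candidate - baseline) / baseline


# Hypothetical accuracies (percent) for two test conditions.
# These are illustrative values only, not results from the paper.
shape_acc = [50.0, 40.0]
dbnf_acc = [60.0, 44.0]

# Relative gain per condition, then an unweighted average (an assumption).
per_condition = [relative_improvement(b, c) for b, c in zip(shape_acc, dbnf_acc)]
average_gain = mean(per_condition)
print(per_condition, average_gain)
```

Under this reading, a relative improvement is measured against the baseline score, so it differs from the absolute difference in percentage points between the two systems.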
Pages: 195-199
Page count: 5