A Scalable Approach to Using DNN-Derived Features in GMM-HMM Based Acoustic Modeling For LVCSR

Cited by: 0
Authors
Yan, Zhi-Jie [1]
Huo, Qiang [1]
Xu, Jian [1]
Affiliation
[1] Microsoft Research Asia, Beijing, People's Republic of China
Keywords
deep neural network; DNN-based feature extraction; DNN-GMM-HMM; DNN-HMM; LVCSR
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a new scalable approach to using deep neural network (DNN) derived features in Gaussian mixture density hidden Markov model (GMM-HMM) based acoustic modeling for large vocabulary continuous speech recognition (LVCSR). The DNN-based feature extractor is trained on a subset of the training data to mitigate the scalability issue of DNN training, while the GMM-HMMs are trained with state-of-the-art scalable training methods and tools to leverage the whole training set. In a benchmark evaluation, we first trained a DNN on the 309-hour Switchboard-I (SWB) training data; used in a traditional DNN-HMM based approach, it achieves a word error rate (WER) of 15.4% on the NIST 2000 Hub5 evaluation set. When the same DNN is used as a feature extractor and the 2,000-hour "SWB+Fisher" training data is used to train the GMM-HMMs, our DNN-GMM-HMM approach achieves a WER of 13.8%. With unsupervised adaptation performed per conversation side, the WER drops further to 13.1%.
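To put the three reported WERs on a common scale, the relative reduction against the DNN-HMM baseline can be computed as below. This is only an illustrative sketch: the function name and constants are hypothetical, and only the three WER values come from the abstract. Note that the DNN-GMM-HMM systems also use more training data (2,000 hours versus 309 hours), so the reductions reflect the approach as a whole rather than the modeling change alone.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction of new_wer versus baseline_wer, in percent."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# WERs in percent, as reported in the abstract (names are illustrative).
DNN_HMM_WER = 15.4              # DNN-HMM, 309-hour SWB training data
DNN_GMM_HMM_WER = 13.8          # DNN-GMM-HMM, 2,000-hour SWB+Fisher data
DNN_GMM_HMM_ADAPTED_WER = 13.1  # as above, plus unsupervised adaptation

if __name__ == "__main__":
    for label, wer in [
        ("DNN-GMM-HMM", DNN_GMM_HMM_WER),
        ("DNN-GMM-HMM + adaptation", DNN_GMM_HMM_ADAPTED_WER),
    ]:
        reduction = relative_wer_reduction(DNN_HMM_WER, wer)
        print(f"{label}: {reduction:.1f}% relative WER reduction")
```

Under this definition, 13.8% corresponds to roughly a 10.4% relative reduction over the 15.4% baseline, and 13.1% to roughly 14.9%.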
Pages: 104-108
Page count: 5