Integration of Articulatory Knowledge and Voicing Features Based on DNN/HMM for Mandarin Speech Recognition

Authors
Tan, Ying-Wei [1]
Liu, Wen-Ju [1]
Jiang, Wei [1]
Zheng, Hao [1]
Affiliations
[1] Chinese Acad Sci, Inst Automation, Dept Natl Lab Pattern Recognit, Beijing 100864, Peoples R China
Keywords
MARKOV-MODELS; ACOUSTICS;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Speech production knowledge has been used successfully to enhance phonetic representation and the performance of automatic speech recognition (ASR) systems. Representations of speech production offer simple explanations for many phenomena observed in speech that cannot be easily analyzed from either the acoustic signal or the phonetic transcription alone. One of the most important aspects of speech production knowledge is articulatory knowledge, which describes the smooth and continuous movements in the vocal tract. In this paper, we present a new articulatory model that provides information for rescoring the hypotheses in the speech recognition lattice. The articulatory model consists of a feature front-end, which computes a voicing feature based on a spectral harmonics correlation (SHC) function, and a back-end based on the combination of deep neural networks (DNNs) and hidden Markov models (HMMs). The voicing features are combined with standard Mel-frequency cepstral coefficients (MFCCs) using heteroscedastic linear discriminant analysis (HLDA) to improve speech recognition accuracy. Moreover, the algorithm exploits the advantages of both models: it retains the deep learning properties of DNNs while modeling the articulatory context powerfully through HMMs. Mandarin speech recognition experiments show that the proposed method achieves significant improvements in recognition performance over a system using MFCCs alone.
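The abstract does not say how the articulatory model's scores are combined with the first-pass recognizer's scores during rescoring. As a rough illustration only, the sketch below assumes a simple log-linear interpolation over an N-best list (a flattened stand-in for a lattice). The `Hypothesis` class, the `rescore` function, the `weight` parameter, and all score values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One recognition hypothesis with illustrative log-domain scores."""
    text: str
    baseline_score: float      # log score from the first-pass recognizer (made up)
    articulatory_score: float  # log score from a second, articulatory model (made up)


def rescore(hypotheses, weight=0.3):
    """Rank hypotheses by a log-linear mix of the two scores, best first.

    `weight` is a hypothetical tuning parameter, not a value from the paper.
    """
    def combined(h):
        return (1.0 - weight) * h.baseline_score + weight * h.articulatory_score

    return sorted(hypotheses, key=combined, reverse=True)


# Toy N-best list: the baseline slightly prefers A, the second model prefers B.
nbest = [
    Hypothesis("hypothesis A", baseline_score=-10.0, articulatory_score=-9.0),
    Hypothesis("hypothesis B", baseline_score=-10.5, articulatory_score=-4.0),
]
best = rescore(nbest, weight=0.3)[0]
```

With `weight=0.0` the ranking reduces to the baseline order, so the parameter controls how much the second model is allowed to overturn the first-pass result. In practice such a weight would be tuned on held-out data.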
Pages: 8
Related papers
50 items in total
  • [1] Use of voicing features in HMM-based speech recognition
    Thomson, DL
    Chengalvarayan, R
    [J]. SPEECH COMMUNICATION, 2002, 37 (3-4) : 197 - 211
  • [2] Improving Mandarin Tone Recognition Based on DNN by Combining Acoustic and Articulatory Features
    Lin, Ju
    Xie, Yanlu
    Gao, Yingming
    Zhang, Jinsong
    [J]. 2016 10TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2016,
  • [3] Improving Mandarin Tone Recognition Based on DNN by Combining Acoustic and Articulatory Features Using Extended Recognition Networks
    Ju Lin
    Wei Li
    Yingming Gao
    Yanlu Xie
    Nancy F. Chen
    Sabato Marco Siniscalchi
    Jinsong Zhang
    Chin-Hui Lee
    [J]. Journal of Signal Processing Systems, 2018, 90 : 1077 - 1087
  • [4] Improving Mandarin Tone Recognition Based on DNN by Combining Acoustic and Articulatory Features Using Extended Recognition Networks
    Lin, Ju
    Li, Wei
    Gao, Yingming
    Xie, Yanlu
    Chen, Nancy F.
    Siniscalchi, Sabato Marco
    Zhang, Jinsong
    Lee, Chin-Hui
    [J]. JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2018, 90 (07) : 1077 - 1087
  • [5] IMPROVED TONE MODELING BY EXPLOITING ARTICULATORY FEATURES FOR MANDARIN SPEECH RECOGNITION
    Chao, Hao
    Yang, Zhanlei
    Liu, Wenju
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4741 - 4744
  • [6] Incorporating the voicing information into HMM-based automatic speech recognition
    Jancovic, Peter
    Koekueer, Muenevver
    [J]. 2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2, 2007, : 42 - 46
  • [7] An HMM-based speech recognizer using overlapping articulatory features
    Erler, K
    Freeman, GH
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1996, 100 (04) : 2500 - 2513
  • [8] Integrating Articulatory Features Into HMM-Based Parametric Speech Synthesis
    Ling, Zhen-Hua
    Richmond, Korin
    Yamagishi, Junichi
    Wang, Ren-Hua
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2009, 17 (06) : 1171 - 1185
  • [9] SPEECH SYNTHESIS USING ARTICULATORY-KNOWLEDGE BASED HMM STRUCTURE
    Gu, Hung-Yan
    Lai, Ming-Yen
    Hong, Wei-Siang
    [J]. PROCEEDINGS OF 2014 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), VOL 1, 2014, : 371 - 376
  • [10] AN INVESTIGATION ON DNN-DERIVED BOTTLENECK FEATURES FOR GMM-HMM BASED ROBUST SPEECH RECOGNITION
    You, Yongbin
    Qian, Yanmin
    He, Tianxing
    Yu, Kai
    [J]. 2015 IEEE CHINA SUMMIT & INTERNATIONAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING, 2015, : 30 - 34