Comparing NN paradigms in hybrid NN/HMM speech recognition using tied posteriors

Cited by: 0
Authors
Stadermann, J [1]
Rigoll, G [1]
Affiliations
[1] Tech Univ Munich, Inst Human Machine Commun, D-80290 Munich, Germany
Source
ASRU'03: 2003 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING ASRU '03 | 2003
DOI
10.1109/ASRU.2003.1318409
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Hybrid NN/HMM acoustic modeling is now an established alternative approach in automatic speech recognition. A comparison of feed-forward and recurrent neural network topologies integrated into the tied-posteriors framework is presented. We give some insights into the training process of the networks that estimate class posterior probabilities, and we show how a network's quality can be determined, by means of a new measure, before evaluating the complete ASR system. Finally, we demonstrate the flexibility of the tied-posteriors framework by showing results for different context-independent and context-dependent acoustic models, all based on the same system structure.
Pages: 89 - 93
Page count: 5
Related papers
50 in total
  • [31] Augmenting the Discrimination Power of HMM by NN for On-Line Cursive Script Recognition
    Seung-Ho Lee
    Jin H. Kim
    Applied Intelligence, 1997, 7 : 305 - 314
  • [32] Recognition of Chinese speech using hybrid HMM/HNN models
    Jia, Ying
    Du, Limin
    Hou, Ziqiang
    International Conference on Signal Processing Proceedings, ICSP, 1998, 1 : 726 - 729
  • [33] Recognition of Chinese speech using hybrid HMM HNN models
    Jia, Y
    Du, LM
    Hou, ZQ
    ICSP '98: 1998 FOURTH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, PROCEEDINGS, VOLS I AND II, 1998 : 726 - 729
  • [34] Speech/speaker recognition using a HMM/GMM hybrid model
    Rodriguez, E
    Ruiz, B
    Garcia-Crespo, A
    Garcia, F
    AUDIO- AND VIDEO-BASED BIOMETRIC PERSON AUTHENTICATION, 1997, 1206 : 227 - 234
  • [35] Augmenting the discrimination power of HMM by NN for on-line cursive script recognition
    Lee, SH
    Kim, JH
    APPLIED INTELLIGENCE, 1997, 7 (04) : 305 - 314
  • [36] Comparison between fuzzy and NN method for speech emotion recognition
    Razak, AA
    Komiya, R
    Abidin, MIZ
    THIRD INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND APPLICATIONS, VOL 1, PROCEEDINGS, 2005 : 297 - 302
  • [37] RASR/NN: THE RWTH NEURAL NETWORK TOOLKIT FOR SPEECH RECOGNITION
    Wiesler, Simon
    Richard, Alexander
    Golik, Pavel
    Schlueter, Ralf
    Ney, Hermann
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014
  • [38] DIRICHLET MIXTURE MODELS OF NEURAL NET POSTERIORS FOR HMM-BASED SPEECH RECOGNITION
    Balakrishnan, V
    Sivaram, G. S. V. S.
    Khudanpur, Sanjeev
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011 : 5028 - 5031
  • [39] Robust speech recognition with selective input data to a NN classifier
    Cong, L
    Asghar, S
    INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED PROCESSING TECHNIQUES AND APPLICATIONS, VOLS I-IV, PROCEEDINGS, 1998 : 1817 - 1824
  • [40] Noise adaptation of HMM speech recognition systems using tied-mixtures in the spectral domain
    Erell, A
    Burshtein, D
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1997, 5 (01) : 72 - 74