Hybrid NN/HMM acoustic modeling techniques for distributed speech recognition

Cited by: 1
Authors
Stadermann, Jan [1]
Rigoll, Gerhard [1]
Affiliations
[1] Tech Univ Munich, Inst Human Machine Commun, D-8000 Munich, Germany
Keywords
distributed speech recognition; tied-posteriors; hybrid speech recognition
DOI
10.1016/j.specom.2006.01.007
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Distributed speech recognition (DSR), in which the recognizer is split into two parts connected via a transmission channel, offers new perspectives for improving speech recognition performance in mobile environments. In this work, we present the integration of hybrid acoustic models using tied-posteriors in a distributed environment. A comparison with standard Gaussian models is performed on the AURORA2 task and the WSJ0 task. Word-based HMMs and phoneme-based HMMs are trained for distributed and non-distributed recognition using either MFCC or RASTA-PLP features. The results show that hybrid modeling techniques can outperform standard continuous systems on this task. In particular, the tied-posteriors approach is shown to be usable for DSR in a very flexible way, since the client can be modified without a change at the server site and vice versa. © 2006 Elsevier B.V. All rights reserved.
Pages: 1037-1046
Page count: 10
Related papers
50 in total
  • [1] Hybrid HMM-NN modeling of stationary-transitional units for continuous speech recognition
    Albesano, D
    Gemello, R
    Mana, F
    [J]. PROGRESS IN CONNECTIONIST-BASED INFORMATION SYSTEMS, VOLS 1 AND 2, 1998, : 1112 - 1115
  • [2] Hybrid HMM-NN modeling of stationary-transitional units for continuous speech recognition
    Albesano, D
    Gemello, R
    Mana, F
    [J]. INFORMATION SCIENCES, 2000, 123 (1-2) : 3 - 11
  • [3] Hybrid modeling of PHMM and HMM for speech recognition
    Ogawa, T
    Kobayashi, T
    [J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 140 - 143
  • [4] Comparing NN paradigms in hybrid NN/HMM speech recognition using tied posteriors
    Stadermann, J
    Rigoll, G
    [J]. ASRU'03: 2003 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING ASRU '03, 2003, : 89 - 93
  • [5] Comparison of standard and hybrid modeling techniques for distributed speech recognition
    Stadermann, J
    Rigoll, G
    [J]. ASRU 2001: IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, CONFERENCE PROCEEDINGS, 2001, : 143 - 146
  • [6] Hybrid HMM-NN for speech recognition and prior class probabilities
    Albesano, D
    Gemello, R
    Mana, F
    [J]. ICONIP'02: PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON NEURAL INFORMATION PROCESSING: COMPUTATIONAL INTELLIGENCE FOR THE E-AGE, 2002, : 2391 - 2395
  • [7] HMM/NN hybrids for continuous speech recognition
    Alim, OAA
    Elboghdadly, N
    El Shaar, NM
    [J]. PROCEEDINGS OF THE EIGHTEENTH NATIONAL RADIO SCIENCE CONFERENCE, VOLS 1 AND 2, 2001, : 509 - 516
  • [8] A hybrid HMM/BN acoustic model for automatic speech recognition
    Markov, K
    Nakamura, S
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2003, E86D (03): : 438 - 445
  • [9] Applying Batch Normalization to Hybrid NN-HMM Model For Speech Recognition
    Zhan, Hongjian
    Chen, Guilin
    Lu, Yue
    [J]. PATTERN RECOGNITION (CCPR 2016), PT II, 2016, 663 : 427 - 435
  • [10] A NN/HMM hybrid for continuous speech recognition with a discriminant nonlinear feature extraction
    Rigoll, G
    Willett, D
    [J]. PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998, : 9 - 12