Context-dependent acoustic modeling using graphemes for large vocabulary speech recognition

Authors
Kanthak, S [1]
Ney, H [1]
Affiliations
[1] Rhein Westfal TH Aachen, Dept Comp Sci, Lehrstuhl Informat 6, D-52056 Aachen, Germany
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, we propose using a decision tree based on graphemic acoustic sub-word units together with phonetic questions. We also show that automatic question generation can completely eliminate any manual effort. We present experimental results on four corpora in different languages: the Dutch ARISE corpus, the Italian EUTRANS EVAL00 evaluation corpus, the German VERBMOBIL '00 development corpus, and the English North American Business '94 20k and 64k development corpora. For all experiments, the acoustic models are trained from scratch so that no prior phonetic knowledge is used. Complete training procedures have been iterated to simulate the long optimization history used for the phonemic acoustic models. We show that, with minimal manual effort, the presented approach works surprisingly well for the Dutch, German, and Italian corpora and increases the word error rate by no more than 2% relative. On the English NAB task, the error rate is about 20% higher than in experiments using a pronunciation lexicon.
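As a rough illustration of what graphemic sub-word units with context look like, the sketch below spells each word into letters and expands them into left/right context triples, the kind of units a decision tree would then cluster. It is not the authors' implementation: the function names, the `#` boundary symbol, the letters-only filtering, and the error rates in the usage note are all assumptions made for illustration, and the tree clustering and question generation themselves are not shown.

```python
from typing import Dict, List, Tuple


def graphemic_lexicon(words: List[str]) -> Dict[str, List[str]]:
    """Map each word to its sequence of lowercase letters (graphemes).

    Assumption: non-alphabetic characters are simply dropped.
    """
    return {w: [ch for ch in w.lower() if ch.isalpha()] for w in words}


def trigraphemes(units: List[str], boundary: str = "#") -> List[Tuple[str, str, str]]:
    """Expand a grapheme sequence into (left, center, right) context triples.

    The boundary symbol marks the word start and end (a hypothetical choice).
    """
    padded = [boundary] + units + [boundary]
    return [(padded[i - 1], padded[i], padded[i + 1])
            for i in range(1, len(padded) - 1)]


def relative_increase(baseline_wer: float, new_wer: float) -> float:
    """Relative change of a word error rate, in percent of the baseline."""
    return 100.0 * (new_wer - baseline_wer) / baseline_wer
```

For example, `trigraphemes(graphemic_lexicon(["ab"])["ab"])` gives `[('#', 'a', 'b'), ('a', 'b', '#')]`. With hypothetical error rates of 10.0% for a phonemic baseline and 10.2% for a graphemic system, `relative_increase(10.0, 10.2)` is about 2, which is how a "2% relative" increase is read.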
Pages: 845-848
Number of pages: 4