Context-dependent acoustic modeling using graphemes for large vocabulary speech recognition

Cited by: 0

Authors:

Kanthak, S [1]

Ney, H [1]

Affiliation:

[1] Rhein Westfal TH Aachen, Dept Comp Sci, Lehrstuhl Informat 6, D-52056 Aachen, Germany

Source:

2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS | 2002

Keywords:

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

In this paper we propose to use a decision tree based on graphemic acoustic sub-word units together with phonetic questions. We also show that automatic question generation can be used to completely eliminate any manual effort. We present experimental results on four corpora with different languages, namely the Dutch ARISE corpus, the Italian EUTRANS EVAL00 evaluation corpus, the German VERBMOBIL '00 development corpus and the English North American Business '94 20k and 64k development corpora. For all experiments, the acoustic models are trained from scratch in order not to use any prior phonetic knowledge. Complete training procedures have been iterated to simulate the long optimization history used for the phonemic acoustic models. With minimal manual effort we show that for the Dutch, German and Italian corpora, the presented approach works surprisingly well and increases the word error rate by not more than 2% relative. On the English NAB task the error rate is about 20% higher compared to experiments using a pronunciation lexicon.
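The two ideas in the abstract that lend themselves to a small illustration are the graphemic sub-word units (a word's spelling stands in for its phoneme string) and the "relative" word error rate change. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names `grapheme_lexicon` and `relative_wer_increase` are hypothetical, and the numbers are made up for illustration and are not results from the paper.

```python
def grapheme_lexicon(words):
    """Map each word to its sequence of letters.

    Assumption: one acoustic unit per alphabetic character, lowercased.
    The paper may treat case, digits, or special characters differently.
    """
    return {w: [ch for ch in w.lower() if ch.isalpha()] for w in words}


def relative_wer_increase(baseline_wer, new_wer):
    """Relative change of new_wer with respect to baseline_wer, in percent."""
    return 100.0 * (new_wer - baseline_wer) / baseline_wer


# Illustrative inputs only (not taken from the paper).
lexicon = grapheme_lexicon(["Aachen", "speech"])
change = relative_wer_increase(10.0, 10.2)
```

Here a hypothetical phoneme-based system at 10.0% word error rate and a grapheme-based system at 10.2% would differ by about 2% relative, although the absolute difference is only 0.2 percentage points.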


Pages: 845-848

Page count: 4

50 records in total

[21] A tutorial on pronunciation modeling for large vocabulary speech recognition
Fosler-Lussier, E
TEXT- AND SPEECH-TRIGGERED INFORMATION ACCESS, 2003, 2705 : 38 - 77
[22] Prosodic Modeling in Large Vocabulary Mandarin Speech Recognition
Huang, Jui-Ting
Lee, Lin-shan
INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006 : 1241 - 1244
[23] A Fast Approximate Acoustic Match for Large Vocabulary Speech Recognition
Bahl, Lalit R.
De Gennaro, Steven V.
Gopalakrishnan, P. S.
Mercer, Robert L.
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1993, 1 (01) : 59 - 67
[24] Building DNN acoustic models for large vocabulary speech recognition
Maas, Andrew L.
Qi, Peng
Xie, Ziang
Hannun, Awni Y.
Lengerich, Christopher T.
Jurafsky, Daniel
Ng, Andrew Y.
COMPUTER SPEECH AND LANGUAGE, 2017, 41 : 195 - 213
[25] Boosting HMM acoustic models in large vocabulary speech recognition
Meyer, C
Schramm, H
SPEECH COMMUNICATION, 2006, 48 (05) : 532 - 548
[26] Analysis of context-dependent segmental duration for automatic speech recognition
Wang, X
Pols, LCW
tenBosch, LFM
ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996 : 1181 - 1184
[27] Tandem acoustic modeling in large-vocabulary recognition
Ellis, DPW
Singh, R
Sivadas, S
2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURAL NETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001 : 517 - 520
[28] Large Vocabulary Children's Speech Recognition with DNN-HMM and SGMM Acoustic Modeling
Giuliani, Diego
BabaAli, Bagher
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015 : 1635 - 1639
[29] Probabilistic Speaker-Class based Acoustic Modeling for Large Vocabulary Continuous Speech Recognition
Li, Xiangang
Su, Dan
Pang, Zaihu
Wu, Xihong
13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012 : 1218 - 1221
[30] Using Morphological Data in Language Modeling for Serbian Large Vocabulary Speech Recognition
Pakoci, Edvin
Popovic, Branislav
Pekar, Darko
COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2019, 2019
