Context dependent initial/final acoustic modeling for continuous Chinese speech recognition

Cited by: 0

Authors:

Li, Jing ^{[1]}

Zheng, Fang ^{[1,2]}

Zhang, Jiyong ^{[1]}

Wu, Wenhu ^{[1]}

Affiliations:

[1] Lab. of Intelligent Technol., Dept. of Comp. Sci. and Technol., Tsinghua Univ., Beijing 100084, China

[2] Beijing d-Ear Technol. Co. Ltd., Beijing 100085, China

Source:

Qinghua Daxue Xuebao/Journal of Tsinghua University | 2004 / Vol. 44 / No. 01

Keywords:

Decision support systems - Mathematical models;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Acoustic modeling is important for continuous Chinese speech recognition. The extended initial/final (XIF) set, a basic speech recognition unit set derived from an analysis of Chinese language characteristics, outperformed the standard IF set. Decision tree-based state tying was used to construct the context-dependent initial/final acoustic model (Tri-XIF model), with a question set designed on the basis of Chinese linguistic knowledge. Methods were developed to optimize Tri-XIF modeling, including transcription refinement, question set extension, and model size reduction. Tests show that Tri-XIF modeling performs much better than either Tri-phone modeling or syllable modeling, with the syllable error rate reduced by 24.53% compared with the Tri-phone model and by 41.65% compared with the syllable model. Applying these methods to the Tri-XIF model reduced the model size by more than 20% with little performance deterioration.
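
As an illustration of what "context-dependent initial/final" units look like, the following minimal sketch expands a sequence of initial/final units into left-center+right labels. It rests on assumptions that are not taken from the paper: the function name `to_context_dependent`, the `sil` boundary symbol, the `l-c+r` label format (borrowed from common triphone naming), and the use of plain Pinyin initials and finals rather than the paper's XIF set.

```python
def to_context_dependent(units, boundary="sil"):
    """Expand a unit sequence into left-center+right labels.

    `boundary` is a hypothetical symbol used for the missing
    neighbor at the start and end of the utterance.
    """
    labels = []
    for i, center in enumerate(units):
        left = units[i - 1] if i > 0 else boundary
        right = units[i + 1] if i < len(units) - 1 else boundary
        labels.append(f"{left}-{center}+{right}")
    return labels


# Assumed Pinyin decomposition of "zhong guo": initial + final per syllable.
units = ["zh", "ong", "g", "uo"]
labels = to_context_dependent(units)
print(labels)
```

Each distinct label would be a candidate model before any tying. The number of such labels grows quickly with the unit inventory, which is why a state tying step such as the decision tree-based one named in the abstract is needed to share parameters among them.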

Pages: 61-64
