Context dependent initial/final acoustic modeling for continuous Chinese speech recognition

Cited by: 0

Authors:

Li, Jing [1]

Zheng, Fang [1,2]

Zhang, Jiyong [1]

Wu, Wenhu [1]

Affiliations:

[1] Lab. of Intelligent Technol., Dept. of Comp. Sci. and Technol., Tsinghua Univ., Beijing 100084, China

[2] Beijing d-Ear Technol. Co. Ltd., Beijing 100085, China

Source:

Qinghua Daxue Xuebao/Journal of Tsinghua University | 2004, Vol. 44, No. 01

Keywords:

Decision support systems - Mathematical models;

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

Acoustic modeling is important for continuous Chinese speech recognition. An extended initial/final (XIF) set, designed as the basic speech recognition unit set from an analysis of Chinese language characteristics, outperformed the standard initial/final (IF) set. Decision tree-based state tying was used to construct the context-dependent initial/final acoustic model (Tri-XIF model), with a question set designed on the basis of Chinese linguistic knowledge. Several methods were developed to optimize Tri-XIF modeling, including transcription refinement, question set extension, and model size reduction. Tests show that Tri-XIF modeling performs much better than either Tri-phone modeling or syllable modeling: the syllable error rate was reduced by 24.53% compared with the Tri-phone model and by 41.65% compared with the syllable model. With these methods, the Tri-XIF model size was reduced by more than 20% with little performance deterioration.


Pages: 61-64
