Context dependent initial/final acoustic modeling for continuous Chinese speech recognition

Cited by: 0
Authors
Li, Jing [1]
Zheng, Fang [1, 2]
Zhang, Jiyong [1]
Wu, Wenhu [1]
Affiliations
[1] Lab. of Intelligent Technol., Dept. of Comp. Sci. and Technol., Tsinghua Univ., Beijing 100084, China
[2] Beijing d-Ear Technol. Co. Ltd., Beijing 100085, China
Keywords
Decision support systems - Mathematical models;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
Acoustic modeling is important for continuous Chinese speech recognition. The extended initial/final (XIF) set, a basic speech recognition unit set designed around the characteristics of the Chinese language, outperformed the standard IF set. Decision tree-based state tying was used to construct the context-dependent initial/final acoustic model (Tri-XIF model), with an appropriate question set designed from Chinese linguistic knowledge. Several methods were developed to optimize Tri-XIF modeling, including transcription refinement, question set extension, and model size reduction. Tests show that Tri-XIF modeling performs much better than either Tri-phone modeling or syllable modeling: the syllable error rate was reduced by 24.53% compared with the Tri-phone model and by 41.65% compared with the syllable model. With these methods, the size of the Tri-XIF model was reduced by more than 20% with little performance deterioration.
Pages: 61-64