Mandarin pronunciation modeling based on CASS corpus

Cited by: 8
Authors
Zheng, F [1]
Song, ZJ
Fung, P
Byrne, W
Affiliations
[1] Tsing Hua Univ, Dept Comp Sci & Technol, State Key Lab Intelligent Technol & Syst, Ctr Speech Technol, Beijing 100084, Peoples R China
[2] Hong Kong Univ Sci & Technol, Dept Elect & Elect Engn, Hong Kong, Hong Kong, Peoples R China
[3] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
Funding
US National Science Foundation;
Keywords
pronunciation modeling; generalized initial and final; generalized syllable; refined acoustic modeling; context-dependent weighting; iterative forced-alignment based transcribing;
DOI
10.1007/BF02947304
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Pronunciation variability is an important issue that must be addressed when developing practical automatic spontaneous speech recognition systems. In this paper, the factors that may affect recognition performance are analyzed, including those specific to the Chinese language. By studying the INITIAL/FINAL (IF) characteristics of the Chinese language and developing the Bayesian equation, the concepts of generalized INITIAL/FINAL (GIF) and generalized syllable (GS), GIF modeling and IF-GIF modeling, and context-dependent pronunciation weighting are proposed, based on a phonetically well-transcribed seed database. Using these methods, the Chinese syllable error rate (SER) is reduced by 6.3% and 4.2% compared with GIF modeling and IF modeling, respectively, when no language model, such as a syllable or word N-gram, is used. The effectiveness of these methods is also demonstrated when additional data without phonetic transcription are used to refine the acoustic model with the proposed iterative forced-alignment based transcribing (IFABT) method, which achieves a 5.7% SER reduction.
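As a rough illustration of what context-dependent pronunciation weighting can look like, the sketch below estimates the probability of a surface (actually pronounced) unit given its canonical unit and a neighboring context by relative frequency over aligned pairs. This is only a minimal sketch under stated assumptions: the function name, the use of a single left-context unit, and the toy unit labels are hypothetical and are not taken from the paper, whose exact formulation is not given in the abstract.

```python
from collections import Counter, defaultdict


def estimate_pronunciation_weights(aligned_pairs):
    """Estimate P(surface | left_context, canonical) by relative frequency.

    aligned_pairs: iterable of (left_context, canonical_unit, surface_unit)
    tuples, e.g. obtained by aligning a canonical transcription with a
    phonetic one. The tuple layout is an assumption for this sketch.
    """
    counts = defaultdict(Counter)
    for left_context, canonical, surface in aligned_pairs:
        counts[(left_context, canonical)][surface] += 1

    weights = {}
    for key, surface_counts in counts.items():
        total = sum(surface_counts.values())
        weights[key] = {s: c / total for s, c in surface_counts.items()}
    return weights


# Toy aligned data (invented for illustration, not from the CASS corpus).
pairs = [
    ("a", "zh", "zh"),
    ("a", "zh", "z"),
    ("a", "zh", "zh"),
    ("a", "zh", "zh"),
    ("i", "zh", "z"),
    ("i", "zh", "z"),
]

weights = estimate_pronunciation_weights(pairs)
for key in sorted(weights):
    print(key, weights[key])
```

In a recognizer, weights of this kind would typically be attached to alternative pronunciations in the lexicon so that a variant is favored only in the contexts where it was actually observed. A real system would also need smoothing for contexts that are rare or unseen in the seed data.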
Pages: 249-263
Number of pages: 15