Mandarin pronunciation modeling based on CASS corpus

Cited by: 8
Authors
Zheng, F [1]
Song, ZJ
Fung, P
Byrne, W
Affiliations
[1] Tsing Hua Univ, Dept Comp Sci & Technol, State Key Lab Intelligent Technol & Syst, Ctr Speech Technol, Beijing 100084, Peoples R China
[2] Hong Kong Univ Sci & Technol, Dept Elect & Elect Engn, Hong Kong, Hong Kong, Peoples R China
[3] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
Source
Funding
US National Science Foundation;
Keywords
pronunciation modeling; generalized initial and final; generalized syllable; refined acoustic modeling; context-dependent weighting; iterative forced-alignment based transcribing;
DOI
10.1007/BF02947304
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Pronunciation variability is an important issue that must be addressed when developing practical automatic spontaneous speech recognition systems. In this paper, the factors that may affect recognition performance are analyzed, including those specific to the Chinese language. By studying the INITIAL/FINAL (IF) characteristics of the Chinese language and developing the Bayesian equation, the paper proposes the concepts of generalized INITIAL/FINAL (GIF) and generalized syllable (GS), GIF modeling and IF-GIF modeling, and context-dependent pronunciation weighting, all based on a phonetically well-transcribed seed database. With these methods, the Chinese syllable error rate (SER) is reduced by 6.3% and 4.2% compared with GIF modeling and IF modeling, respectively, when no language model (such as a syllable or word N-gram) is used. The effectiveness of these methods is also demonstrated when additional data without phonetic transcription are used to refine the acoustic model with the proposed iterative forced-alignment based transcribing (IFABT) method, achieving a 5.7% SER reduction.
Pages: 249-263
Page count: 15
Related papers
50 in total
  • [1] Mandarin pronunciation modeling based on CASS corpus
    Fang Zheng
    Zhanjiang Song
    Pascale Fung
    William Byrne
    [J]. Journal of Computer Science and Technology, 2002, 17 : 249 - 263
  • [2] A CORPUS FOR THE STUDY ON THE ASSESSMENT OF MANDARIN PRONUNCIATION OF TIBETAN SPEAKERS
    Gan, Z.
    Jiang, J.
    Yan, Y.
    Yang, H.
    [J]. 14TH INTERNATIONAL TECHNOLOGY, EDUCATION AND DEVELOPMENT CONFERENCE (INTED2020), 2020 : 7840 - 7848
  • [3] Pronunciation Variation Modeling for Mandarin with Accent
    Zhang Chi
    Wu Ji
    Xiao Xi
    Wang Zuoying
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006 : 709 - 712
  • [4] MANDARIN MULTIMEDIA CHILD SPEECH CORPUS: CASS_CHILD
    Gao, Jun
    Li, Aijun
    Xiong, Ziyu
    [J]. 2012 INTERNATIONAL CONFERENCE ON SPEECH DATABASE AND ASSESSMENTS, 2012 : 7 - 12
  • [5] Pronunciation Modeling for Spontaneous Mandarin Speech Recognition
    Yi Liu
    Pascale Fung
    [J]. International Journal of Speech Technology, 2004, 7 (2-3) : 155 - 172
  • [6] Designing and implementing a corpus-based online pronunciation learning platform for Cantonese learners of Mandarin
    Chen, Hsueh Chu
    Han, Qian Wen
    [J]. INTERACTIVE LEARNING ENVIRONMENTS, 2020, 28 (01) : 18 - 31
  • [7] Modeling partial pronunciation variations for spontaneous Mandarin speech recognition
    Liu, Y
    Fung, P
    [J]. COMPUTER SPEECH AND LANGUAGE, 2003, 17 (04) : 357 - 379
  • [8] A DNN-BASED ACOUSTIC MODELING OF TONAL LANGUAGE AND ITS APPLICATION TO MANDARIN PRONUNCIATION TRAINING
    Hu, Wenping
    Qian, Yao
    Soong, Frank K.
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [9] Mandarin accent adaptation based on context-independent/context-dependent pronunciation modeling
    Liu, MK
    Xu, B
    Huang, TY
    Deng, YG
    Li, CR
    [J]. 2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000 : 1025 - 1028
  • [10] Tongue Visualization Model for Mandarin Pronunciation Based on MRI
    Zhang, S. C.
    Liu, C.
    Li, F. J.
    Wang, L.
    Niu, H. J.
    [J]. 12TH ASIAN-PACIFIC CONFERENCE ON MEDICAL AND BIOLOGICAL ENGINEERING, VOL 1, APCMBE 2023, 2024, 103 : 362 - 369