Tone articulation modeling for Mandarin spontaneous speech recognition

Cited by: 0
Authors
Zhou, JL
Tian, Y
Shi, Y
Huang, C
Chang, E
Institutions
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Tone modeling is an unavoidable problem in Mandarin speech recognition. In continuous speech, the pitch contour of a syllable varies widely and is strongly influenced by its tone context. Although several effective methods have been proposed to improve the accuracy for tonal syllables in Mandarin continuous speech recognition, many recognition errors are still caused by the poor tone discrimination capability of the acoustic model [1][2][3][4]. The problem is worse for spontaneous speech. In this paper, we report our work on tone articulation modeling. Tone-context-dependent models are used to model the unstable pitch patterns caused by co-articulation in continuous speech, and the corresponding acoustic features are investigated as well. Our methods are evaluated on two test sets: one of reading-style speech and one of spontaneous speech. The experimental results show that on the spontaneous (casual) speech test set the proposed method is more effective than a tone-context-independent model, while the two are comparable on the reading-style test set. Several factors that could improve the proposed method are discussed in the final part of this paper.
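The abstract does not say how the tone-context-dependent units are labeled. As a rough illustration only, the sketch below expands a sequence of syllable tones into units that carry the left and right neighboring tones, by analogy with triphone labels. The function name `tone_context_labels`, the `left-current+right` label format, and the `sil` boundary symbol are all assumptions made for this example, not the authors' scheme.

```python
from typing import List


def tone_context_labels(tones: List[int], boundary: str = "sil") -> List[str]:
    """Expand a tone sequence into context-dependent labels.

    Each label has the hypothetical form "left-current+right", where
    left/right are the neighboring tones or a boundary symbol at the
    edges of the utterance. This format is an assumption for
    illustration, not the paper's notation.
    """
    labels = []
    for i, tone in enumerate(tones):
        # Use the boundary symbol when there is no neighbor on that side.
        left = str(tones[i - 1]) if i > 0 else boundary
        right = str(tones[i + 1]) if i < len(tones) - 1 else boundary
        labels.append(f"{left}-{tone}+{right}")
    return labels


# Tones of the syllables in "ni3 hao3 ma5" (5 denotes the neutral tone).
print(tone_context_labels([3, 3, 5]))
# → ['sil-3+3', '3-3+5', '3-5+sil']
```

A tone-context-independent model would instead use a single unit per tone (here just `3`, `3`, `5`), so the same tone shares one model regardless of its neighbors.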
Pages: 997 - 1000
Page count: 4
Related papers
50 in total
  • [1] Tone Modeling for Continuous Mandarin Speech Recognition
    Cao, Yang
    Zhang, Shuwu
    Huang, Taiyi
    Xu, Bo
    [J]. International Journal of Speech Technology, 2004, 7 (2-3) : 115 - 128
  • [2] Pronunciation Modeling for Spontaneous Mandarin Speech Recognition
    Yi Liu
    Pascale Fung
    [J]. International Journal of Speech Technology, 2004, 7 (2-3) : 155 - 172
  • [3] Maximum Entropy Based Tone Modeling for Mandarin Speech Recognition
    Wang, Xinhao
    Yu, Yansuo
    Wu, Xihong
    Chi, Huisheng
    [J]. 2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4850 - 4853
  • [4] Improved Tone Modeling for Mandarin Broadcast News Speech Recognition
    Lei, Xin
    Siu, Manhung
    Hwang, Mei-Yuh
    Ostendorf, Mari
    Lee, Tan
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1237 - +
  • [5] Word-Level Tone Modeling for Mandarin Speech Recognition
    Lei, Xin
    Ostendorf, Mari
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 665 - +
  • [6] Improved Tone Modeling by Exploiting Articulatory Features for Mandarin Speech Recognition
    Chao, Hao
    Yang, Zhanlei
    Liu, Wenju
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4741 - 4744
  • [7] Modeling partial pronunciation variations for spontaneous Mandarin speech recognition
    Liu, Y
    Fung, P
    [J]. COMPUTER SPEECH AND LANGUAGE, 2003, 17 (04) : 357 - 379
  • [8] A Tone Recognition Framework for Continuous Mandarin Speech
    He, Lei
    Hao, Jie
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1575 - 1578
  • [9] An Investigation of the Target Approximation Model for Tone Modeling and Recognition in Continuous Mandarin Speech
    Gao, Yingming
    Zhang, Xinyu
    Xu, Yi
    Zhang, Jinsong
    Birkholz, Peter
    [J]. INTERSPEECH 2020, 2020, : 1913 - 1917
  • [10] Different facial cues for different speech styles in Mandarin tone articulation
    Garg, Saurabh
    Hamarneh, Ghassan
    Sereno, Joan
    Jongman, Allard
    Wang, Yue
    [J]. FRONTIERS IN COMMUNICATION, 2023, 8