Pitch models of Mandarin text-to-speech

Authors
邵艳秋 [1,2]
穗志方 [1]
韩纪庆 [2]
Affiliations
[1] Institute of Computational Linguistics, Peking University
[2] School of Computer Science and Technology, Harbin Institute of
Abstract
The performance of the prosody model directly affects the naturalness of synthesized speech. To address the difficulty of generating the pitch contour in the prosody model, this paper studies two pitch models in depth: a corpus-based pitch model and a pitch pattern model. The key problems in the corpus-based model are calculating the distance and searching for the optimal path with a dynamic programming algorithm. In the pitch pattern model, parameters such as pitch pattern, pitch average, and pitch range describe the pitch contour, and six pitch patterns are presented. For pitch contour generation, the pitch pattern model is more flexible than the corpus-based model. Both models were integrated into a real TTS system, and the MOS results for synthesized Mandarin speech show that the pitch pattern model performs better than the corpus-based pitch model.
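The abstract names distance calculation and a dynamic programming search as the key steps of the corpus-based model but gives no details. The following is a minimal sketch of how such a search might look, not the authors' method. It assumes each syllable has a few candidate pitch contours drawn from a corpus, and it uses two hypothetical distances: `target_cost` (gap between a candidate's mean F0 and a desired mean) and `join_cost` (F0 jump at the syllable boundary). All function names and cost definitions here are illustrative assumptions.

```python
from typing import List, Sequence


def target_cost(candidate: Sequence[float], target_mean: float) -> float:
    """Hypothetical target distance: |mean F0 of candidate - desired mean F0|."""
    return abs(sum(candidate) / len(candidate) - target_mean)


def join_cost(prev: Sequence[float], cur: Sequence[float]) -> float:
    """Hypothetical concatenation distance: F0 jump across the boundary."""
    return abs(prev[-1] - cur[0])


def best_path(candidates: List[List[List[float]]],
              target_means: List[float]) -> List[int]:
    """Pick one candidate contour per syllable with a Viterbi-style search.

    candidates[i][j] is the j-th candidate F0 contour (Hz) for syllable i.
    Returns the chosen candidate index for every syllable.
    """
    # cost[i][j]: minimal accumulated cost of any path ending at candidate j
    # of syllable i; back[i][j]: index of its best predecessor.
    cost = [[target_cost(c, target_means[0]) for c in candidates[0]]]
    back = [[-1] * len(candidates[0])]

    for i in range(1, len(candidates)):
        row_cost, row_back = [], []
        for cur in candidates[i]:
            prev_row = candidates[i - 1]
            k_best = min(
                range(len(prev_row)),
                key=lambda k: cost[i - 1][k] + join_cost(prev_row[k], cur),
            )
            total = (cost[i - 1][k_best]
                     + join_cost(prev_row[k_best], cur)
                     + target_cost(cur, target_means[i]))
            row_cost.append(total)
            row_back.append(k_best)
        cost.append(row_cost)
        back.append(row_back)

    # Backtrack from the cheapest final candidate.
    j = min(range(len(cost[-1])), key=lambda k: cost[-1][k])
    path = [j]
    for i in range(len(candidates) - 1, 0, -1):
        j = back[i][j]
        path.append(j)
    path.reverse()
    return path


if __name__ == "__main__":
    # Two syllables, two candidate contours each (values in Hz, made up).
    cands = [
        [[200.0, 210.0], [180.0, 150.0]],
        [[150.0, 140.0], [212.0, 220.0]],
    ]
    print(best_path(cands, target_means=[205.0, 215.0]))
```

In a real system the two distances would likely depend on richer context (tone, position in the prosodic phrase, duration); the sketch only shows how per-candidate and boundary costs combine in the path search.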
Pages: 179-184
Page count: 6