Pitch models of Mandarin text-to-speech

Authors
邵艳秋 [1, 2]
穗志方 [1]
韩纪庆 [2]
Affiliations
[1] Institute of Computational Linguistics, Peking University
[2] School of Computer Science and Technology, Harbin Institute of
Abstract
The performance of the prosody model directly affects the naturalness of synthesized speech. To address the difficulty of generating the pitch contour within the prosody model, this paper studies two pitch models in depth: a corpus-based pitch model and a pitch pattern model. The key problems in the corpus-based model are calculating the distance and searching for the optimal path with a dynamic programming algorithm. In the pitch pattern model, parameters such as pitch pattern, pitch average, and pitch range describe the pitch contour, and six pitch patterns are presented. For generating the pitch contour, the pitch pattern model is more flexible than the corpus-based model. Both models were integrated into a real TTS system, and the mean opinion score (MOS) results for synthesized Mandarin speech show that the pitch pattern model performs better than the corpus-based pitch model.
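The optimal-path search mentioned for the corpus-based model can be illustrated with a small dynamic programming sketch. The abstract does not give the paper's distance measure or unit representation, so everything here is an assumption: each candidate unit is reduced to a single mean pitch value in Hz, the distance is a plain absolute difference, and the names `best_path`, `candidates`, `targets`, and `join_weight` are hypothetical.

```python
from typing import List, Sequence


def best_path(
    candidates: Sequence[Sequence[float]],
    targets: Sequence[float],
    join_weight: float = 1.0,
) -> List[int]:
    """Pick one candidate index per syllable with minimal total cost.

    Assumed costs (not taken from the paper):
      target cost = |candidate pitch - target pitch|
      join cost   = join_weight * |candidate pitch - previous candidate pitch|
    """
    n = len(candidates)
    # cost[i][j]: cheapest accumulated cost of a path ending at candidate j of syllable i
    cost = [[abs(c - targets[0]) for c in candidates[0]]]
    # back[i][j]: index of the best predecessor of candidate j at syllable i
    back = [[-1] * len(candidates[0])]

    for i in range(1, n):
        prev = candidates[i - 1]
        row_cost, row_back = [], []
        for c in candidates[i]:
            # Accumulated cost through each possible predecessor k.
            through = [cost[i - 1][k] + join_weight * abs(c - prev[k])
                       for k in range(len(prev))]
            k_best = min(range(len(prev)), key=through.__getitem__)
            row_cost.append(through[k_best] + abs(c - targets[i]))
            row_back.append(k_best)
        cost.append(row_cost)
        back.append(row_back)

    # Backtrack from the cheapest final candidate.
    j = min(range(len(cost[-1])), key=cost[-1].__getitem__)
    path = [j]
    for i in range(n - 1, 0, -1):
        j = back[i][j]
        path.append(j)
    path.reverse()
    return path


# Three syllables, two candidate units each (mean pitch in Hz, made-up values).
print(best_path([[200, 120], [210, 130], [125, 220]], [120, 130, 125]))
```

A real system would compare full pitch contours and contextual features; the scalar version only shows how the accumulated cost and backpointers yield the optimal path.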
Pages: 179-184
Number of pages: 6