Pitch models of Mandarin text-to-speech

Cited by: 0

Authors:

Shao Yanqiu (邵艳秋) [1,2]

Sui Zhifang (穗志方) [1]

Han Jiqing (韩纪庆) [2]

Affiliations:

[1] Institute of Computational Linguistics, Peking University

[2] School of Computer Science and Technology, Harbin Institute of

Source:

Journal of Harbin Institute of Technology | 2009, Vol. 16, No. 2

DOI: Not available

Abstract:

The performance of the prosody model directly affects the naturalness of synthesized speech. To address the difficulty of generating the pitch contour within the prosody model, this paper studies two pitch models in depth: a corpus-based pitch model and a pitch pattern model. The key problems in the corpus-based model are the distance calculation and the search for the optimal path using a dynamic programming algorithm. In the pitch pattern model, parameters such as the pitch pattern, pitch average, and pitch range describe the pitch contour, and six pitch patterns are presented. For pitch contour generation, the pitch pattern model is more flexible than the corpus-based model. Both models are integrated into a real TTS system, and MOS results for the synthesized Mandarin speech show that the pitch pattern model performs better than the corpus-based pitch model.
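The abstract names distance calculation and a dynamic-programming search for the optimal path as the key problems of the corpus-based model, but gives no formulas. The following is only a minimal sketch of that general idea, not the authors' method: each syllable position has several candidate pitch values from a corpus, and the search picks one candidate per position so that the summed cost is smallest. The function `best_path`, the two cost functions, the weight 0.5, and all numbers are hypothetical and chosen purely for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def best_path(
    candidates: Sequence[Sequence[float]],
    target_cost: Callable[[int, float], float],
    join_cost: Callable[[float, float], float],
) -> Tuple[float, List[int]]:
    """Pick one candidate per position with minimal total cost.

    Returns the total cost and the index of the chosen candidate
    at each position (a Viterbi-style dynamic programming search).
    """
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every position needs at least one candidate")

    # cost[j]: cheapest cost of any path ending at candidate j
    # of the position processed so far.
    cost = [target_cost(0, c) for c in candidates[0]]
    back: List[List[int]] = []  # back-pointers for positions 1..n-1

    for i in range(1, len(candidates)):
        prev = candidates[i - 1]
        new_cost, pointers = [], []
        for c in candidates[i]:
            # Best predecessor for this candidate.
            k = min(range(len(prev)),
                    key=lambda k: cost[k] + join_cost(prev[k], c))
            new_cost.append(cost[k] + join_cost(prev[k], c)
                            + target_cost(i, c))
            pointers.append(k)
        cost = new_cost
        back.append(pointers)

    # Trace the cheapest path backwards from the last position.
    j = min(range(len(cost)), key=cost.__getitem__)
    total = cost[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return total, path


# Hypothetical data: predicted pitch per syllable (Hz) and corpus candidates.
targets = [200.0, 220.0, 180.0]
candidates = [[190.0, 210.0], [150.0, 225.0], [185.0, 300.0]]

total, path = best_path(
    candidates,
    target_cost=lambda i, c: abs(c - targets[i]),  # distance to the target
    join_cost=lambda a, b: 0.5 * abs(a - b),       # penalty for pitch jumps
)
print(total, [candidates[i][j] for i, j in enumerate(path)])
```

In this sketch the target cost stands in for the "distance" and the join cost discourages abrupt pitch jumps between neighbouring syllables; a real system would likely use richer contextual features for both.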


Pages: 179-184

Page count: 6
