Pitch models of Mandarin text-to-speech

Cited by: 0

Authors:

邵艳秋 [1,2]

穗志方 [1]

韩纪庆 [2]

Affiliations:

[1] Institute of Computational Linguistics, Peking University

[2] School of Computer Science and Technology, Harbin Institute of

Source:

Journal of Harbin Institute of Technology | 2009 / Vol. 16 / No. 02

DOI: not available

Abstract:

The quality of the prosody model directly affects the naturalness of synthesized speech. To address the difficulty of generating the pitch contour within the prosody model, this paper studies two pitch models in depth: a corpus-based pitch model and a pitch pattern model. The key problems in the corpus-based model are calculating the distance and searching for the optimal path with a dynamic programming algorithm. In the pitch pattern model, parameters such as pitch pattern, pitch average, and pitch range describe the pitch contour, and six pitch patterns are presented. For pitch contour generation, the pitch pattern model is more flexible than the corpus-based model. Both models are integrated into a real TTS system, and MOS results for the synthesized Mandarin speech show that the pitch pattern model outperforms the corpus-based pitch model.
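The abstract names distance calculation and a dynamic-programming path search as the key problems of the corpus-based model but gives no formulas. The following is a minimal sketch of that general idea only, not the authors' method. Everything in it is an assumption made for illustration: the representation of each candidate contour as a `(start_f0, end_f0)` pair in Hz, the hypothetical `target_cost` and `join_cost` distance functions, and the function name `select_contours`.

```python
from typing import List, Sequence, Tuple

Contour = Tuple[float, float]  # assumed form: (start_f0, end_f0) in Hz


def target_cost(cand: Contour, target_mean: float) -> float:
    # Hypothetical distance: gap between the candidate's mean F0
    # and the mean F0 predicted for this syllable.
    return abs((cand[0] + cand[1]) / 2.0 - target_mean)


def join_cost(prev: Contour, nxt: Contour) -> float:
    # Hypothetical distance: F0 jump at the syllable boundary.
    return abs(prev[1] - nxt[0])


def select_contours(
    candidates: Sequence[Sequence[Contour]],
    targets: Sequence[float],
) -> Tuple[List[int], float]:
    """Viterbi-style search for the lowest-cost candidate per syllable."""
    cost = [[target_cost(c, targets[0]) for c in candidates[0]]]
    back: List[List[int]] = [[-1] * len(candidates[0])]
    for i in range(1, len(candidates)):
        row_cost, row_back = [], []
        for c in candidates[i]:
            # Best predecessor for this candidate.
            j = min(
                range(len(candidates[i - 1])),
                key=lambda k: cost[i - 1][k] + join_cost(candidates[i - 1][k], c),
            )
            row_cost.append(
                cost[i - 1][j]
                + join_cost(candidates[i - 1][j], c)
                + target_cost(c, targets[i])
            )
            row_back.append(j)
        cost.append(row_cost)
        back.append(row_back)
    # Trace the cheapest path back from the last syllable.
    k = min(range(len(cost[-1])), key=lambda idx: cost[-1][idx])
    total = cost[-1][k]
    path = [k]
    for i in range(len(candidates) - 1, 0, -1):
        k = back[i][k]
        path.append(k)
    path.reverse()
    return path, total


if __name__ == "__main__":
    # Made-up numbers for two syllables with two candidates each.
    cands = [[(200.0, 180.0), (220.0, 210.0)], [(175.0, 160.0), (215.0, 230.0)]]
    print(select_contours(cands, [215.0, 222.0]))
```

The search keeps, for every candidate of the current syllable, only the cheapest way to reach it, so the work grows with the number of syllables times the square of the candidates per syllable rather than with the number of full paths.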

Pages: 179-184

Page count: 6
