Automatic generation of synthesis units and prosodic information for Chinese concatenative synthesis

Cited by: 39

Authors:

Wu, CH [1]

Chen, JH [1]

Affiliation:

[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan, Taiwan

Source:

SPEECH COMMUNICATION | 2001 / Vol. 35 / Issue 3-4

Keywords:

Chinese text-to-speech conversion; synthesis units; prosodic information; concatenative synthesis; pitch contour; syllable duration

DOI:

10.1016/S0167-6393(00)00075-3

Chinese Library Classification:

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

In this paper, some approaches to the generation of synthesis units and prosodic information are proposed for Mandarin Chinese text-to-speech (TTS) conversion. Monosyllables are adopted as the basic synthesis units. A set of synthesis units is selected from a large continuous speech database based on two cost functions, which minimize the inter- and intra-syllable distortion. The speech database is also employed to establish a word-prosody-based template tree according to the linguistic features: tone combination, word length, part-of-speech (POS) of the word, and word position in a phrase. This template tree stores the prosodic features, including pitch contour, average energy, and syllable duration of a word, for possible combinations of linguistic features. Two modules for sentence intonation and template selection are proposed to generate the target prosodic templates. The experimental results showed that the synthesized prosodic features matched quite well with their original counterparts. Evaluation by subjective experiments also confirmed the satisfactory performance of these approaches. (C) 2001 Elsevier Science B.V. All rights reserved.
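As a rough illustration of the template idea described in the abstract, the sketch below stores one prosodic template per combination of linguistic features and looks it up by those features. It is not the authors' implementation: the names `ProsodicTemplate` and `TemplateStore`, the units, the feature encodings, and the use of a flat dictionary in place of a tree are all assumptions made for this example. It also assumes that word length is the number of syllables, so that it equals the number of tones.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Tuple

# Key: (tone combination, word length, POS, position in phrase).
# The encodings (integer tones, string POS and position labels) are assumed.
FeatureKey = Tuple[Tuple[int, ...], int, str, str]


@dataclass
class ProsodicTemplate:
    """Prosodic features of one word (hypothetical units)."""
    pitch_contour: List[float]       # assumed: pitch samples in Hz
    average_energy: float            # assumed: normalized energy
    syllable_durations: List[float]  # assumed: seconds per syllable


class TemplateStore:
    """Flat stand-in for the template tree: one template per feature key."""

    def __init__(self) -> None:
        self._templates: Dict[FeatureKey, ProsodicTemplate] = {}

    @staticmethod
    def _key(tones: Sequence[int], pos: str, position: str) -> FeatureKey:
        # Assumption: word length equals the number of syllables (tones).
        return (tuple(tones), len(tones), pos, position)

    def add(self, tones: Sequence[int], pos: str, position: str,
            template: ProsodicTemplate) -> None:
        self._templates[self._key(tones, pos, position)] = template

    def lookup(self, tones: Sequence[int], pos: str,
               position: str) -> Optional[ProsodicTemplate]:
        # Returns None when no template exists for this combination.
        return self._templates.get(self._key(tones, pos, position))
```

A caller would fill the store from a labeled speech database and then call `lookup` for each word in the input text. How a missing combination is handled is not specified in the abstract, so this sketch simply returns `None`.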


Pages: 219-237

Number of pages: 19
