Increasing Prosodic Variability of Text-To-Speech Synthesizers

Cited by: 0

Authors:

Nemeth, Geza [1]

Fek, Mark [1]

Csapo, Tamas Gabor [1]

Affiliations:

[1] Budapest University of Technology and Economics, Department of Telecommunications and Media Informatics, Budapest, Hungary

Source:

INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4 | 2007

Keywords:

speech synthesis; prosodic variability; F0; variation; transplantation

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

The lack of prosody variation in text-to-speech systems contributes to their perceived unnaturalness when synthesizing extended passages. In this paper, we present a method to improve prosody generation in this direction. A database of natural sample sentences is searched for sentences having similar word and syllable structure to the input. One sentence is selected randomly from the similar sentences found. The prosody of the randomly selected natural sentence is used as a target to generate the prosody of the synthetic one. An experiment was conducted to determine the potential of the proposed method. The rule-based pitch contour generation of a Hungarian concatenative synthesizer was replaced by a semi-automatic implementation of the proposed method. A listening test showed that subjects preferred sentences synthesized by the proposed method over a rule-based solution.
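The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define "similar", so the sketch assumes that two sentences are similar when they have the same number of syllables in each word, and it assumes one F0 target per syllable. The names `NaturalSentence`, `find_similar`, and `pick_target_prosody`, as well as the sample values, are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NaturalSentence:
    """A database entry (hypothetical layout, not taken from the paper)."""
    syllables_per_word: List[int]   # e.g. [2, 1, 3] for a three-word sentence
    f0_per_syllable: List[float]    # assumed: one F0 target in Hz per syllable


def find_similar(database: List[NaturalSentence],
                 structure: List[int]) -> List[NaturalSentence]:
    """Return entries whose word/syllable structure equals the input's.

    Exact equality is an assumption; the abstract only says "similar".
    """
    return [s for s in database if s.syllables_per_word == structure]


def pick_target_prosody(database: List[NaturalSentence],
                        structure: List[int],
                        rng=random) -> Optional[List[float]]:
    """Randomly select one similar sentence and return a copy of its F0 targets.

    Returns None when nothing matches, so a caller could fall back to
    rule-based prosody generation.
    """
    candidates = find_similar(database, structure)
    if not candidates:
        return None
    return list(rng.choice(candidates).f0_per_syllable)


# Made-up sample data, for illustration only.
database = [
    NaturalSentence([2, 1, 3], [120.0, 110.0, 115.0, 130.0, 105.0, 95.0]),
    NaturalSentence([2, 1, 3], [125.0, 118.0, 108.0, 122.0, 112.0, 90.0]),
    NaturalSentence([1, 2], [130.0, 110.0, 100.0]),
]

# Input sentence: three words with 2, 1, and 3 syllables.
contour = pick_target_prosody(database, [2, 1, 3], random.Random(0))
```

Because the target is drawn at random from all matching sentences, repeated synthesis of inputs with the same structure can receive different contours, which is the source of the variability the abstract aims for.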

Pages: 1981-1984

Page count: 4
