A prosodic diphone database for Korean text-to-speech synthesis system

Cited by: 0

Authors:

Yoon, K [1]

Affiliations:

[1] Ohio State Univ, Columbus, OH 43220 USA

Source:

COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING | 2005, Vol. 3406

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper presents a prosodically conditioned diphone database to be used in a Korean text-to-speech (TTS) synthesis system. The diphones are prosodically conditioned in the sense that a single conventional diphone is stored as different versions taken directly from the different prosodic domains of prosodically labeled, read sentences (following the K-ToBI prosodic labeling conventions [3]). Four levels of Korean prosodic domains were observed in the diphone selection process, so four different versions of each diphone were selected. A 400-sentence subset of the Korean Newswire Text Corpora [5] was converted to its pronounced form as described in [8], and its read version was prosodically labeled. The greedy algorithm [7] identified 223 sentences containing 1,853 prosodic diphones (out of the 3,977 possible prosodic diphones) that can synthesize all four hundred utterances. Although our system cannot synthesize an unlimited number of sentences at this stage, the quality of the synthesized sentences strongly suggests that it is a viable option to use prosodically conditioned diphones in a text-to-speech synthesis system.
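The sentence-selection step mentioned in the abstract can be pictured as a greedy set-cover problem. The sketch below is only an illustration under stated assumptions, not the paper's implementation: the function name `greedy_select`, the representation of a prosodic diphone as a `(diphone, domain)` pair, the domain labels `D1` to `D4`, and the toy sentences are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Assumption for illustration: a prosodic diphone is a (diphone, domain) pair.
ProsodicDiphone = Tuple[str, str]


def greedy_select(sentences: Dict[str, Set[ProsodicDiphone]]) -> List[str]:
    """Greedily pick sentences until every prosodic diphone is covered."""
    # The units to cover are all prosodic diphones that occur in the corpus.
    uncovered: Set[ProsodicDiphone] = set().union(*sentences.values())
    chosen: List[str] = []
    while uncovered:
        # Pick the sentence contributing the most still-uncovered units;
        # iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(sentences), key=lambda s: len(sentences[s] & uncovered))
        chosen.append(best)
        uncovered -= sentences[best]
    return chosen


# Hypothetical toy corpus: sentence id -> prosodic diphones it contains.
toy = {
    "s1": {("a-b", "D1"), ("b-c", "D2")},
    "s2": {("a-b", "D1")},
    "s3": {("b-c", "D2"), ("c-d", "D3")},
    "s4": {("c-d", "D3"), ("d-e", "D4"), ("a-b", "D1")},
}
selected = greedy_select(toy)
print(selected)
```

On the toy corpus, two of the four sentences are enough to cover every unit, which mirrors in miniature how a subset of the read sentences can cover all prosodic diphones found in the full set. A greedy choice of this kind is not guaranteed to return the smallest possible subset.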

Pages: 425-428

Number of pages: 4

Total: 50 items

[1] A prosodic phrasing model for a Korean text-to-speech synthesis system
Yoon, K
[J]. COMPUTER SPEECH AND LANGUAGE, 2006, 20 (01) : 69 - 79
[2] Diphone Databases for Lithuanian text-to-speech synthesis
Kasparaitis, P
[J]. INFORMATICA, 2005, 16 (02) : 193 - 202
[3] Building diphone database for Arabic text to speech synthesis system
El Kadhi, Aymen
Gherri, Fadhila
Amiri, Hamid
[J]. 3RD INTERNATIONAL CONFERENCE ON CONTROL, ENGINEERING & INFORMATION TECHNOLOGY (CEIT 2015), 2015,
[4] Diphone Spanish Text-to-Speech Synthesizer
Rybarova, Renata
del Corral, Gonzalo
Rozinaj, Gregor
[J]. 2015 INTERNATIONAL CONFERENCE ON SYSTEMS, SIGNALS AND IMAGE PROCESSING (IWSSIP 2015), 2015, : 121 - 124
[5] A prosodic model for text-to-speech synthesis in French
Di Cristo, A
Di Cristo, P
Campione, E
Véronis, J
[J]. INTONATION: ANALYSIS, MODELLING AND TECHNOLOGY, 2000, 15 : 321 - 355
[6] A Prosodic Text-to-Speech System for Yoruba Language
Akinwonmi, Akintoba Emmanuel
Alese, Boniface Kayode
[J]. 2013 8TH INTERNATIONAL CONFERENCE FOR INTERNET TECHNOLOGY AND SECURED TRANSACTIONS (ICITST), 2013, : 630 - 635
[7] Prosodic Annotation in a Thai Text-to-speech System
Potisuk, Siripong
[J]. PACLIC 21: THE 21ST PACIFIC ASIA CONFERENCE ON LANGUAGE, INFORMATION AND COMPUTATION, PROCEEDINGS, 2007, : 405 - 414
[8] Prosodic annotation in a Thai Text-to-speech system
Department of Electrical and Computer Engineering, Citadel, Military College of South Carolina, 171 Moultrie Street, Charleston, SC 29409, United States
[J]. PACLIC - Pacific Asia Conf. Lang., Inf. Comput., Proc., 2007, : 405 - 414
[9] Speech synthesis for text-to-speech alignment and prosodic feature extraction
Malfrere, F
Dutoit, T
[J]. ISCAS '97 - PROCEEDINGS OF 1997 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS I - IV: CIRCUITS AND SYSTEMS IN THE INFORMATION AGE, 1997, : 2637 - 2640
[10] A Novel Quasi-Diphone Inventory Approach to Text-To-Speech Synthesis
Gerazov, Branislav
Shutinoski, Goce
Arsov, Goce
[J]. 2008 IEEE MEDITERRANEAN ELECTROTECHNICAL CONFERENCE, VOLS 1 AND 2, 2008, : 778 - 783
