A prosodic diphone database for Korean text-to-speech synthesis system

Author
Yoon, K [1]
Affiliation
[1] Ohio State Univ, Columbus, OH 43220 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a prosodically conditioned diphone database for use in a Korean text-to-speech (TTS) synthesis system. The diphones are prosodically conditioned in the sense that a single conventional diphone is stored in different versions, each taken directly from a different prosodic domain of prosodically labeled read sentences (following the K-ToBI prosodic labeling conventions [3]). Four levels of Korean prosodic domains were observed in the diphone selection process, yielding four different versions of each diphone. A 400-sentence subset of the Korean Newswire Text Corpora [5] was converted to its pronounced form as described in [8], and its read version was prosodically labeled. The greedy algorithm [7] identified 223 sentences containing 1,853 prosodic diphones (out of the 3,977 possible prosodic diphones) that can synthesize all four hundred utterances. Although our system cannot synthesize an unlimited number of sentences at this stage, the quality of the synthesized sentences strongly suggests that prosodically conditioned diphones are a viable option for a text-to-speech synthesis system.
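The sentence selection step described in the abstract can be read as a set-cover problem: pick a small group of sentences that together contain every prosodic diphone needed. The following is a minimal sketch of a generic greedy set-cover heuristic, not the paper's actual procedure, which is specified only by reference to [7]. The function name `greedy_cover`, the toy data, and the representation of a prosodic diphone as a `(diphone, level)` pair with integer levels 1 to 4 are all assumptions made for illustration.

```python
def greedy_cover(sentences):
    """Pick sentences until every unit seen in the corpus is covered.

    sentences: dict mapping a sentence id to the set of units it contains.
    Here a unit is a hypothetical (diphone, prosodic_level) pair.
    Returns the chosen sentence ids in the order they were picked.
    """
    # Everything that occurs anywhere in the corpus must be covered.
    uncovered = set().union(*sentences.values())
    chosen = []
    while uncovered:
        # Take the sentence that adds the most still-uncovered units.
        # Ties go to the sentence that appears first in the dict.
        best = max(sentences, key=lambda s: len(sentences[s] & uncovered))
        gained = sentences[best] & uncovered
        if not gained:
            break  # defensive: nothing left that any sentence can add
        chosen.append(best)
        uncovered -= gained
    return chosen


# Toy corpus (invented data): the same diphone at two different
# prosodic levels would count as two distinct units.
toy = {
    "s1": {("a-n", 1), ("n-a", 2), ("a-#", 4)},
    "s2": {("a-n", 1), ("n-a", 2)},
    "s3": {("k-a", 1), ("a-#", 4)},
    "s4": {("k-a", 1)},
}
chosen = greedy_cover(toy)
print(chosen)
```

On this toy corpus, two of the four sentences are enough to cover all four units. A greedy pass like this gives a compact selection but is not guaranteed to find the smallest possible one.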
Pages: 425-428
Number of pages: 4