A Prosodic Mandarin Text-to-Speech System Based on Tacotron

Cited by: 0
Authors
Zhang, Chuxiong [1 ]
Zhang, Sheng [2 ]
Zhong, Haibing [2 ]
Affiliations
[1] Duke Kunshan Univ, Data Sci Res Ctr, Suzhou, Jiangsu, Peoples R China
[2] Jiangsu Jinling Sci & Technol Grp Ltd, Nanjing, Jiangsu, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP31 [Computer software];
Subject classification code
081202 ; 0835 ;
Abstract
Tacotron performs well in English speech synthesis and automatically aligns two arbitrary sequences from different domains. However, applying Tacotron to Mandarin Chinese text-to-speech (TTS) requires a prosody system in order to generate more natural speech. This paper proposes a practical method for incorporating prosodic annotation into Tacotron training for a Mandarin Chinese synthesis system. In this approach, a prosody model that predicts prosodic boundaries from the given text serves as the front end, followed by a Tacotron synthesis system trained on a well-labeled TTS database containing prosodic annotations. In a subjective evaluation of prosody, the synthesis system performs better when the prosody model is added as the front end for Tacotron.
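The two-stage pipeline described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the `#1` to `#3` boundary tags, the function name `annotate_with_prosody`, and the hand-written boundary positions that stand in for the output of a trained prosody model. The sketch only shows how predicted prosodic boundaries could be interleaved with pinyin syllables before the annotated sequence is passed to a Tacotron-style synthesizer.

```python
from typing import Dict, List


def annotate_with_prosody(syllables: List[str], boundaries: Dict[int, str]) -> str:
    """Interleave prosodic boundary tags with pinyin syllables.

    `boundaries` maps a syllable index to the tag placed after that syllable.
    In a real system this map would come from a prosody model; here it is
    supplied by hand (hypothetical values).
    """
    tokens: List[str] = []
    for index, syllable in enumerate(syllables):
        tokens.append(syllable)
        tag = boundaries.get(index)
        if tag is not None:
            tokens.append(tag)
    return " ".join(tokens)


# Hypothetical input: tone-numbered pinyin for a six-syllable sentence.
syllables = ["jin1", "tian1", "tian1", "qi4", "hen3", "hao3"]
# Hypothetical predicted boundaries: a tag after syllable indices 1, 3, and 5.
boundaries = {1: "#1", 3: "#2", 5: "#3"}

annotated = annotate_with_prosody(syllables, boundaries)
print(annotated)  # jin1 tian1 #1 tian1 qi4 #2 hen3 hao3 #3
```

Keeping the boundary tags as ordinary tokens in the input sequence means the synthesizer's text encoder needs no structural change; only its input vocabulary grows by the number of tag types.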
Pages: 165-169
Page count: 5