A Prosodic Mandarin Text-to-Speech System Based on Tacotron

Cited by: 0

Authors:

Zhang, Chuxiong ^{[1]}

Zhang, Sheng ^{[2]}

Zhong, Haibing ^{[2]}

Affiliations:

[1] Duke Kunshan Univ, Data Sci Res Ctr, Suzhou, Jiangsu, Peoples R China

[2] Jiangsu Jinling Sci & Technol Grp Ltd, Nanjing, Jiangsu, Peoples R China

Source:

2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC) | 2019

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Tacotron performs well in English speech synthesis and automatically aligns two arbitrary sequences from different domains. However, applying Tacotron to Mandarin Chinese text-to-speech (TTS) requires a prosody system to generate more natural speech. This paper proposes a practical method for incorporating prosodic annotation into Tacotron training for a Mandarin Chinese synthesis system. In our approach, a prosody model that predicts prosodic boundaries from the given text serves as the front end, followed by a Tacotron synthesis system trained on a well-labeled TTS database containing prosodic annotations. In a subjective evaluation of prosody, the synthesis system performs better when the prosodic system is added as the front end for Tacotron.
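The front-end idea in the abstract, where predicted prosodic boundaries are merged into the text that the synthesis model is trained on, can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and is not taken from the paper: the tone-numbered pinyin input, the `#1` to `#3` break-level markers, the function names `annotate_with_prosody` and `toy_boundary_predictor`, and the toy rule that stands in for a trained prosody model.

```python
from typing import List, Sequence


def toy_boundary_predictor(syllables: Sequence[str]) -> List[int]:
    """Hypothetical stand-in for a trained prosody model.

    Returns one break level per syllable (0 = no break after it).
    This toy rule puts a level-1 break after every second syllable
    and a level-3 break at the end of the utterance.
    """
    levels = [1 if i % 2 == 1 else 0 for i in range(len(syllables))]
    if levels:
        levels[-1] = 3
    return levels


def annotate_with_prosody(syllables: Sequence[str],
                          boundaries: Sequence[int]) -> str:
    """Interleave break markers with syllables.

    boundaries[i] is the break level predicted after syllables[i].
    The "#<level>" marker format is an assumed convention.
    """
    if len(syllables) != len(boundaries):
        raise ValueError("need exactly one boundary level per syllable")
    tokens: List[str] = []
    for syllable, level in zip(syllables, boundaries):
        tokens.append(syllable)
        if level > 0:
            tokens.append(f"#{level}")
    return " ".join(tokens)


if __name__ == "__main__":
    pinyin = ["ni3", "hao3", "shi4", "jie4"]
    # The annotated string is what a sequence-to-sequence synthesizer
    # would see as input in place of the plain syllable sequence.
    print(annotate_with_prosody(pinyin, toy_boundary_predictor(pinyin)))
```

In a setup like this, the same marker vocabulary would have to appear in both the training transcripts and the front-end output at synthesis time, so that the synthesizer sees consistent input symbols.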

Pages: 165-169

Page count: 5
