A high quality text-to-speech system composed of multiple neural networks

Cited by: 0

Authors:

Karaali, O [1]
Corrigan, G [1]
Massey, N [1]
Miller, C [1]
Schnurr, O [1]
Mackie, A [1]

Affiliation:

[1] Motorola Inc, Speech Proc Lab, Schaumburg, IL 60196 USA

Source:

PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6 | 1998

Keywords: (none listed)

DOI: not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206; 082403

Abstract:

While neural networks have been employed to handle several different text-to-speech tasks, ours is the first system to use neural networks throughout, for both linguistic and acoustic processing. We divide the text-to-speech task into three subtasks: a linguistic module mapping from text to a linguistic representation, an acoustic module mapping from the linguistic representation to speech, and a video module mapping from the linguistic representation to animated images. The linguistic module employs a letter-to-sound neural network and a postlexical neural network. The acoustic module employs a duration neural network and a phonetic neural network. The visual neural network is employed in parallel to the acoustic module to drive a talking head. The use of neural networks that can be retrained on the characteristics of different voices and languages affords our system a degree of adaptability and naturalness heretofore unavailable.
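The module layout described in the abstract can be pictured with a minimal Python sketch. Everything here is an illustrative assumption rather than the authors' implementation: the class and function names, the data formats (lists of strings), and the toy stand-ins used in place of the five neural networks are all hypothetical. Only the wiring follows the abstract: letter-to-sound feeds postlexical processing, duration feeds the phonetic stage, and the visual stage reads the same linguistic representation in parallel with the acoustic module.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Phones = List[str]  # hypothetical format for the linguistic representation


@dataclass
class LinguisticModule:
    letter_to_sound: Callable[[str], Phones]   # stand-in for the letter-to-sound network
    postlexical: Callable[[Phones], Phones]    # stand-in for the postlexical network

    def __call__(self, text: str) -> Phones:
        phones: Phones = []
        for word in text.split():
            phones.extend(self.letter_to_sound(word))
        return self.postlexical(phones)


@dataclass
class AcousticModule:
    duration: Callable[[Phones], List[int]]              # stand-in for the duration network
    phonetic: Callable[[Phones, List[int]], List[str]]   # stand-in for the phonetic network

    def __call__(self, phones: Phones) -> List[str]:
        durations = self.duration(phones)
        return self.phonetic(phones, durations)


def synthesize(text: str,
               linguistic: LinguisticModule,
               acoustic: AcousticModule,
               visual: Callable[[Phones], List[str]]) -> Tuple[List[str], List[str]]:
    """Run the linguistic module once, then feed its output to both branches."""
    representation = linguistic(text)
    return acoustic(representation), visual(representation)


# Toy stand-ins (assumptions): one "phone" per letter, two frames per phone.
def toy_letter_to_sound(word: str) -> Phones:
    return list(word.lower())


def toy_postlexical(phones: Phones) -> Phones:
    return phones  # identity; a trained network would adjust pronunciations in context


def toy_duration(phones: Phones) -> List[int]:
    return [2] * len(phones)


def toy_phonetic(phones: Phones, durations: List[int]) -> List[str]:
    return [p for p, n in zip(phones, durations) for _ in range(n)]


def toy_visual(phones: Phones) -> List[str]:
    return [f"mouth:{p}" for p in phones]


linguistic = LinguisticModule(toy_letter_to_sound, toy_postlexical)
acoustic = AcousticModule(toy_duration, toy_phonetic)
speech_frames, video_frames = synthesize("Hi", linguistic, acoustic, toy_visual)
```

Passing each stage in as a callable mirrors the abstract's point that the networks can be retrained for a different voice or language and swapped in without changing the surrounding pipeline.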


Pages: 1237-1240

Page count: 4
