A high quality text-to-speech system composed of multiple neural networks

Authors
Karaali, O [1]
Corrigan, G [1]
Massey, N [1]
Miller, C [1]
Schnurr, O [1]
Mackie, A [1]
Affiliations
[1] Motorola Inc, Speech Proc Lab, Schaumburg, IL 60196 USA
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
While neural networks have been employed to handle several different text-to-speech tasks, ours is the first system to use neural networks throughout, for both linguistic and acoustic processing. We divide the text-to-speech task into three subtasks: a linguistic module mapping from text to a linguistic representation, an acoustic module mapping from the linguistic representation to speech, and a video module mapping from the linguistic representation to animated images. The linguistic module employs a letter-to-sound neural network and a postlexical neural network. The acoustic module employs a duration neural network and a phonetic neural network. The visual neural network is employed in parallel to the acoustic module to drive a talking head. The use of neural networks that can be retrained on the characteristics of different voices and languages affords our system a degree of adaptability and naturalness heretofore unavailable.
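The module layout described in the abstract can be sketched as a small pipeline. This is only an illustration of the data flow, not the authors' implementation: every class, field, and function name below is hypothetical, each callable stands in for a trained neural network, and the choice to feed the predicted durations into the phonetic network is an assumption that the abstract does not confirm.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical types: the abstract does not specify the actual representations.
Phones = List[str]
Speech = List[Tuple[str, float]]


@dataclass
class LinguisticModule:
    """Text -> linguistic representation, via two stand-in networks."""
    letter_to_sound: Callable[[str], Phones]
    postlexical: Callable[[Phones], Phones]

    def __call__(self, text: str) -> Phones:
        return self.postlexical(self.letter_to_sound(text))


@dataclass
class AcousticModule:
    """Linguistic representation -> speech, via two stand-in networks."""
    duration: Callable[[Phones], List[float]]
    # Assumption: the phonetic network consumes the predicted durations.
    phonetic: Callable[[Phones, List[float]], Speech]

    def __call__(self, phones: Phones) -> Speech:
        return self.phonetic(phones, self.duration(phones))


@dataclass
class TextToSpeech:
    linguistic: LinguisticModule
    acoustic: AcousticModule
    visual: Callable[[Phones], List[str]]  # drives the talking head

    def synthesize(self, text: str) -> Tuple[Speech, List[str]]:
        rep = self.linguistic(text)
        # The acoustic and visual paths both read the same representation.
        return self.acoustic(rep), self.visual(rep)


def toy_system() -> TextToSpeech:
    """Wire the pipeline with trivial stubs in place of trained networks."""
    return TextToSpeech(
        linguistic=LinguisticModule(
            letter_to_sound=lambda text: [c for c in text.lower() if c.isalpha()],
            postlexical=lambda phones: phones,
        ),
        acoustic=AcousticModule(
            duration=lambda phones: [0.1] * len(phones),
            phonetic=lambda phones, durs: list(zip(phones, durs)),
        ),
        visual=lambda phones: [f"frame:{p}" for p in phones],
    )
```

Calling `toy_system().synthesize("Hi!")` returns one stub speech segment and one stub animation frame per letter. Because each network is an interchangeable callable, retraining for a new voice or language would amount to swapping the callables while the wiring stays the same.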
Pages: 1237-1240
Page count: 4