Whistler: A trainable text-to-speech system

Cited by: 0
Authors
Huang, XD
Acero, A
Adcock, J
Hon, HW
Goldsmith, J
Liu, JS
Plumpe, M
Affiliations
Keywords
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
We introduce Whistler, a trainable Text-to-Speech (TTS) system that automatically learns its model parameters from a corpus. Both prosody parameters and concatenative speech units are derived through the use of probabilistic learning methods that have been successfully used for speech recognition. Whistler can produce synthetic speech that sounds very natural and resembles the acoustic and prosodic characteristics of the original speaker. The underlying technologies used in Whistler can significantly facilitate the process of creating generic TTS systems for a new language, a new voice, or a new speech style.
Pages: 2387-2390
Page count: 4