A hidden Markov-model-based trainable speech synthesizer

Cited by: 35
Authors
Donovan, RE [1 ]
Woodland, PC [1 ]
Affiliations
[1] Univ Cambridge, Dept Engn, Cambridge CB2 1PZ, England
Source
COMPUTER SPEECH AND LANGUAGE | 1999 / Vol. 13 / Issue 03
Keywords
DOI
10.1006/csla.1999.0123
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a new approach to speech synthesis in which a set of cross-word decision-tree state-clustered context-dependent hidden Markov models are used to define a set of subphone units to be used in a concatenation synthesizer. The models, trees, waveform segments and other parameters representing each clustered state are obtained completely automatically through training on a 1 hour single-speaker continuous-speech database. During synthesis the required utterance, specified as a string of words of known phonetic pronunciation, is generated as a sequence of these clustered states using a TD-PSOLA waveform concatenation synthesizer. The system produces speech which, though in a monotone, is both natural sounding and highly intelligible. A Modified Rhyme Test conducted to measure segmental intelligibility yielded a 5.0% error rate. The speech produced by the system mimics the voice of the speaker used to record the training database. The system can be retrained on a new voice in less than 48 hours, and has been successfully trained on four voices. © 1999 Academic Press.
Pages: 223 - 241
Page count: 19
Related papers
50 in total
  • [1] A hidden Markov model based visual speech synthesizer
    Williams, JJ
    Katsaggelos, AK
    Randolph, MA
    [J]. 2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000, : 2393 - 2396
  • [2] The Indonesian Language Speech Synthesizer Based on the Hidden Markov Model
    Jangtjik, Kevin Alfianto
    Lestari, Dessi Puji
    [J]. 2014 International Conference on Electrical Engineering and Computer Science (ICEECS), 2014, : 12 - 16
  • [3] Trainable speech synthesis with trended Hidden Markov Models
    Dines, J
    Sridharan, S
    [J]. 2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001, : 833 - 836
  • [4] EXPRESSIVE SPEECH IDENTIFICATIONS BASED ON HIDDEN MARKOV MODEL
    Lutfi, Syaheerah L.
    Montero, J. M.
    Barra-Chicote, R.
    Lucas-Cuesta, J. M.
    Gallardo-Antolin, A.
    [J]. HEALTHINF 2009: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON HEALTH INFORMATICS, 2009, : 488 - +
  • [5] A DOMESTIC SPEECH RECOGNITION BASED ON HIDDEN MARKOV MODEL
    Tao, Jun
    Jiang, Xiaoxiao
    [J]. 2011 IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS, 2011, : 606 - 609
  • [6] HIDDEN MARKOV MODELING OF SPEECH BASED ON A SEMICONTINUOUS MODEL
    HUANG, XD
    JACK, MA
    [J]. ELECTRONICS LETTERS, 1988, 24 (01) : 6 - 7
  • [7] Hidden Markov model-based speech emotion recognition
    Schuller, B
    Rigoll, G
    Lang, M
    [J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL II, PROCEEDINGS: SPEECH II; INDUSTRY TECHNOLOGY TRACKS; DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS; NEURAL NETWORKS FOR SIGNAL PROCESSING, 2003, : 1 - 4
  • [8] Hidden Markov model based part of speech tagger for Urdu
    School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen Graduate School, China
    [J]. Inf. Technol. J., 2007, 8 (1190-1198):
  • [9] LARGE VOCABULARY HIDDEN MARKOV MODEL BASED SPEECH RECOGNITION
    RIGOLL, G
    [J]. EUROPEAN TRANSACTIONS ON TELECOMMUNICATIONS, 1990, 1 (01) : 37 - 42
  • [10] English speech recognition method based on Hidden Markov model
    Lv Cuiling
    [J]. 2016 INTERNATIONAL CONFERENCE ON SMART GRID AND ELECTRICAL AUTOMATION (ICSGEA 2016), 2016, : 94 - 97