Statistical Pronunciation Adaptation for Spontaneous Speech Synthesis

Cited by: 0
Authors
Qader, Raheel [1]
Lecorve, Gwenole [1]
Lolive, Damien [1]
Tahon, Marie [1]
Sebillot, Pascale [2]
Affiliations
[1] Univ Rennes 1, ENSSAT, IRISA, Lannion, France
[2] INSA Rennes, IRISA, Rennes, France
Keywords
Speech synthesis; Spontaneous speech; Pronunciation modeling; Statistical adaptation; Conditional random field; Corpus
DOI
10.1007/978-3-319-64206-2_11
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To bring more expressiveness into text-to-speech systems, this paper presents a new pronunciation variant generation method which works by adapting standard, i.e., dictionary-based, pronunciations to a spontaneous style. Its strength and originality lie in exploiting a wide range of linguistic, articulatory and prosodic features, and in using a probabilistic machine learning framework, namely conditional random fields and phoneme-based n-gram models. Extensive experiments on the Buckeye corpus of English conversational speech demonstrate the effectiveness of the approach through objective and perceptual evaluations.
Pages: 92-101
Page count: 10
Related papers
  • [1] Modeling pronunciation variation for spontaneous speech synthesis
    Werner, S
    Wolff, M
    Eichner, M
    Hoffmann, R
    [J]. 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 673 - 676
  • [2] Statistical Transformation of Language and Pronunciation Models for Spontaneous Speech Recognition
    Akita, Yuya
    Kawahara, Tatsuya
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (06): : 1539 - 1549
  • [3] Spontaneous Speech Synthesis by Pronunciation Variant Selection - A Comparison to Natural Speech
    Werner, Steffen
    Hoffmann, Ruediger
    [J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1561 - 1564
  • [4] VTLN ADAPTATION FOR STATISTICAL SPEECH SYNTHESIS
    Saheer, Lakshmi
    Garner, Philip N.
    Dines, John
    Liang, Hui
    [J]. 2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4838 - 4841
  • [5] Towards spontaneous speech synthesis - LM based selection of pronunciation variants
    Eichner, M
    Werner, S
    Wolff, M
    Hoffmann, R
    [J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 248 - 251
  • [6] Pronunciation variant selection for spontaneous speech synthesis - listening effort as a quality parameter
    Werner, Steffen
    Wolff, Matthias
    Hoffmann, Ruediger
    [J]. 2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 857 - 860
  • [7] Pronunciation Modeling for Spontaneous Mandarin Speech Recognition
    Yi Liu
    Pascale Fung
    [J]. International Journal of Speech Technology, 2004, 7 (2-3) : 155 - 172
  • [8] Finding Relevant Features for Statistical Speech Synthesis Adaptation
    Bruneau, Pierrick
    Parisot, Olivier
    Mohammadi, Amir
    Demiroglu, Cenk
    Ghoniem, Mohammad
    Tamisier, Thomas
    [J]. LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2014,
  • [9] TESTING THE CONSISTENCY ASSUMPTION: PRONUNCIATION VARIANT FORCED ALIGNMENT IN READ AND SPONTANEOUS SPEECH SYNTHESIS
    Dall, Rasmus
    Brognaux, Sandrine
    Richmond, Korin
    Valentini-Botinhao, Cassia
    Henter, Gustav Eje
    Hirschberg, Julia
    Yamagishi, Junichi
    King, Simon
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 5155 - 5159
  • [10] Statistical Analysis of the Prosodic Parameters of a Spontaneous Arabic Speech Corpus for Speech Synthesis
    Ali, Ikbel Hadj
    Mnasri, Zied
    [J]. STATISTICAL LANGUAGE AND SPEECH PROCESSING, SLSP 2016, 2016, 9918 : 57 - 67