Statistical Pronunciation Adaptation for Spontaneous Speech Synthesis

Cited by: 0

Authors:

Qader, Raheel [1]

Lecorve, Gwenole [1]

Lolive, Damien [1]

Tahon, Marie [1]

Sebillot, Pascale [2]

Affiliations:

[1] Univ Rennes 1, ENSSAT, IRISA, Lannion, France

[2] INSA Rennes, IRISA, Rennes, France

Source:

TEXT, SPEECH, AND DIALOGUE, TSD 2017 | 2017 / Vol. 10415

Keywords:

Speech synthesis; Spontaneous speech; Pronunciation modeling; Statistical adaptation; Conditional random field; CORPUS;

DOI:

10.1007/978-3-319-64206-2_11

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

To bring more expressiveness into text-to-speech systems, this paper presents a new pronunciation variant generation method that adapts standard, i.e., dictionary-based, pronunciations to a spontaneous style. Its strength and originality lie in exploiting a wide range of linguistic, articulatory and prosodic features, and in using a probabilistic machine learning framework, namely conditional random fields and phoneme-based n-gram models. Extensive experiments on the Buckeye corpus of English conversational speech demonstrate the effectiveness of the approach through objective and perceptual evaluations.


Pages: 92-101

Page count: 10
