Paraphrase generation to improve Text-To-Speech Synthesis

Cited by: 0

Authors:

Putois, Ghislain [1]

Chevelu, Jonathan [1]

Boidin, Cedric [1]

Affiliation:

[1] Orange Labs, Lannion, France

Source:

11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2 | 2010

Keywords:

speech synthesis; statistical paraphrase; unit selection; synthesis costs; Europarl

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Text-to-speech synthesis systems are generally of good quality, especially when adapted to a specific task. Given such a task and an adapted voice corpus, message quality depends mainly on the wording used. This paper presents how a paraphrase generator can be used in synergy with a text-to-speech (TTS) synthesis system to improve its overall performance. Our system is composed of a paraphrase generator using a French-to-French corpus learnt from a bilingual aligned corpus, a TTS selector based on the unit selection cost, and a TTS synthesizer. We present an evaluation of the system, which highlights the need for systematic subjective evaluation.
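
As a rough illustration of the selection step described in the abstract, the sketch below picks, among candidate paraphrases, the one with the lowest synthesis cost. It is not the authors' implementation: `select_paraphrase` and `toy_cost` are hypothetical names, and the toy cost (a simple word count) only stands in for a real unit selection cost that a TTS engine would compute.

```python
from typing import Callable, Iterable


def select_paraphrase(
    candidates: Iterable[str],
    synthesis_cost: Callable[[str], float],
) -> str:
    """Return the candidate sentence with the lowest synthesis cost.

    `synthesis_cost` is an assumed callback; in a real system it would
    query the TTS engine for the unit selection cost of the sentence.
    """
    pool = list(candidates)
    if not pool:
        raise ValueError("no candidate sentences to choose from")
    # min() keeps the first candidate when several share the lowest cost.
    return min(pool, key=synthesis_cost)


def toy_cost(sentence: str) -> float:
    """Placeholder cost (hypothetical): the number of words in the sentence."""
    return float(len(sentence.split()))
```

In such a setup the original sentence would normally be included among the candidates, so that a paraphrase is chosen only when its cost is lower than that of the original wording.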


Pages: 198-201

Page count: 4


[22] Fast Griffin Lim based waveform generation strategy for text-to-speech synthesis
Sharma, Ankit
Kumar, Puneet
Maddukuri, Vikas
Madamshetti, Nagasai
Kishore, K. G.
Kavuru, Sahit Sai Sriram
Raman, Balasubramanian
Roy, Partha Pratim
[J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (41-42) : 30205 - 30233
[23] A prosodic model for text-to-speech synthesis in French
Di Cristo, A
Di Cristo, P
Campione, E
Véronis, J
[J]. INTONATION: ANALYSIS, MODELLING AND TECHNOLOGY, 2000, 15 : 321 - 355
[24] FACTORIZED CONTEXT MODELLING FOR TEXT-TO-SPEECH SYNTHESIS
Lu, Heng
King, Simon
[J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7849 - 7853
[25] A stochastic model of intonation for text-to-speech synthesis
Véronis, J
Di Cristo, P
Courtois, F
Chaumette, C
[J]. SPEECH COMMUNICATION, 1998, 26 (04) : 233 - 244
[26] Database processing for Spanish text-to-speech synthesis
Gómez-Mena, J
Cardo, M
Madrid, JL
Prades, C
[J]. TEXT, SPEECH AND DIALOGUE, PROCEEDINGS, 2000, 1902 : 248 - 252
[27] RyanSpeech: A Corpus for Conversational Text-to-Speech Synthesis
Zandie, Rohola
Mahoor, Mohammad H.
Madsen, Julia
Emamian, Eshrat S.
[J]. INTERSPEECH 2021, 2021, : 2751 - 2755
[28] ASSIGNMENT OF SEGMENTAL DURATION IN TEXT-TO-SPEECH SYNTHESIS
VANSANTEN, JPH
[J]. COMPUTER SPEECH AND LANGUAGE, 1994, 8 (02) : 95 - 128
[29] Text-to-speech synthesis with an Indian language perspective
Panda, Soumya Priyadarsini
Nayak, Ajit Kumar
Patnaik, Srikanta
[J]. INTERNATIONAL JOURNAL OF GRID AND UTILITY COMPUTING, 2015, 6 (3-4) : 170 - 178
[30] Statistical Text-to-Speech Synthesis with Improved Dynamics
Tiomkin, Stas
Malah, David
[J]. INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 1841 - 1844
