Optimal weight tuning method for unit selection cost functions in syllable based text-to-speech synthesis

Cited by: 8

Authors:

Narendra, N. P. [1]

Rao, K. Sreenivasa [1]

Affiliations:

[1] Indian Inst Technol Kharagpur, Sch Informat Technol, Kharagpur 721302, W Bengal, India

Source:

APPLIED SOFT COMPUTING | 2013 / Volume 13 / Issue 02

Keywords:

Text-to-speech synthesis; Unit selection; Target cost; Concatenation cost; Tuning of weights; Genetic algorithm; SYNTHESIS SYSTEM;

DOI:

10.1016/j.asoc.2012.09.023

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper proposes a method for tuning the weights of unit selection cost functions in a syllable-based text-to-speech (TTS) synthesis system. In this work, the unit selection cost functions, namely target cost and concatenation cost, are designed specifically for syllables. The method tunes the weights so that perceptual preference patterns are taken into account when selecting units. A genetic algorithm is used to derive the optimal weights, with a fitness function designed to map perceptual preference patterns onto the weights of the unit selection cost functions. The effectiveness of the proposed method is evaluated with both subjective and objective measures. The results show that the derived optimal weights synthesize good-quality speech compared with manually tuned weights. (C) 2012 Published by Elsevier B.V.
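The abstract does not give the cost formulation or the fitness function, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the total cost is a weighted sum of sub-costs and that each perceptual preference is recorded as a pair of sub-cost vectors (the unit listeners preferred, the unit they rejected). The names weighted_cost, fitness, tune_weights and the toy preference data are all hypothetical.

```python
import random

def weighted_cost(weights, subcosts):
    # Assumed form: total cost is a weighted sum of sub-costs.
    return sum(w * c for w, c in zip(weights, subcosts))

def fitness(weights, preference_pairs):
    # Hypothetical fitness: fraction of listener preferences that the
    # weighted cost reproduces (preferred unit gets the lower cost).
    agree = sum(
        1 for preferred, rejected in preference_pairs
        if weighted_cost(weights, preferred) < weighted_cost(weights, rejected)
    )
    return agree / len(preference_pairs)

def normalize(weights):
    # Keep weights on a common scale so they sum to 1.
    total = sum(weights)
    return [w / total for w in weights]

def tune_weights(preference_pairs, n_weights, pop_size=30, generations=40, seed=0):
    # A plain genetic algorithm: truncation selection with elitism,
    # uniform crossover, and Gaussian mutation. Needs pop_size >= 4.
    rng = random.Random(seed)
    population = [
        normalize([rng.random() + 1e-9 for _ in range(n_weights)])
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        ranked = sorted(population,
                        key=lambda w: fitness(w, preference_pairs),
                        reverse=True)
        parents = ranked[: pop_size // 2]  # the best half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: take each weight from either parent.
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            # Gaussian mutation, clamped so weights stay positive.
            child = [max(1e-9, x + rng.gauss(0, 0.05)) for x in child]
            children.append(normalize(child))
        population = parents + children
    return max(population, key=lambda w: fitness(w, preference_pairs))

# Toy data (invented): listeners consistently prefer the unit with the
# lower first sub-cost even when its second sub-cost is higher.
pairs = [
    ((0.1, 0.8), (0.5, 0.2)),
    ((0.2, 0.9), (0.6, 0.1)),
    ((0.0, 0.5), (0.3, 0.1)),
]
best = tune_weights(pairs, n_weights=2)
print(best, fitness(best, pairs))
```

On this toy data the search should settle on a weight vector that favors the first sub-cost, since that is the only way to agree with every recorded preference. A real system would replace the pairwise-agreement fitness with whatever mapping from perceptual tests to weights the paper defines.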


Pages: 773-781

Page count: 9

