On the Role of Spectral Dynamics in Unit Selection Speech Synthesis

Cited by: 0

Authors:

Kirkpatrick, Barry [1]

O'Brien, Darragh [1]

Scaife, Ronan [1]

Errity, Andrew [1]

Affiliations:

[1] Dublin City Univ, Fac Engn & Comp, Res Inst Networks & Commun Engn, Dublin 9, Ireland

Source:

INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4 | 2007

Keywords:

speech synthesis; join costs; auditory perception; spectral dynamics; feature extraction;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Cost functions employed in unit selection significantly influence the quality of speech output. Although unit selection can produce very natural-sounding speech, the quality can be inconsistent and is difficult to guarantee due to discontinuities between incompatible units. The join cost employed in unit selection to measure the suitability of concatenating speech units typically consists of sub-costs representing the fundamental frequency and spectrum at the boundaries of each unit. In this study, the role of spectral dynamics as a join cost in unit selection synthesis is explored. A number of spectral dynamic measures are tested for the task of detecting discontinuities. Results indicate that spectral dynamic measures correlate with human perception of discontinuity if the features are extracted appropriately. Spectral dynamic mismatch is found to be a source of discontinuity, although results suggest this is likely to occur simultaneously with static spectral mismatch.


Pages: 2029-2032

Page count: 4

