On the Role of Spectral Dynamics in Unit Selection Speech Synthesis

Cited by: 0

Authors:

Kirkpatrick, Barry [1]

O'Brien, Darragh [1]

Scaife, Ronan [1]

Errity, Andrew [1]

Affiliations:

[1] Dublin City Univ, Fac Engn & Comp, Res Inst Networks & Commun Engn, Dublin 9, Ireland

Source:

INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4 | 2007

Keywords:

speech synthesis; join costs; auditory perception; spectral dynamics; feature extraction

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Cost functions employed in unit selection significantly influence the quality of speech output. Although unit selection can produce very natural-sounding speech, the quality can be inconsistent and is difficult to guarantee because of discontinuities between incompatible units. The join cost, which measures the suitability of concatenating speech units, typically consists of sub-costs representing the fundamental frequency and spectrum at the boundaries of each unit. This study explores the role of spectral dynamics as a join cost in unit selection synthesis. A number of spectral dynamic measures are tested for the task of detecting discontinuities. Results indicate that spectral dynamic measures correlate with human perception of discontinuity if the features are extracted appropriately. Spectral dynamic mismatch is found to be a source of discontinuity, although results suggest this is likely to occur simultaneously with static spectral mismatch.
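
As a rough illustration of the distinction the abstract draws between static and dynamic spectral mismatch, the following minimal sketch computes both terms at a candidate join. It is not the authors' method: the abstract does not specify which spectral dynamic measures or features were tested. The function names (`boundary_delta`, `join_cost`), the Euclidean distance, the first-order frame difference, and the weights are all assumptions made for illustration, and each unit is assumed to be a list of per-frame spectral feature vectors.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def boundary_delta(frames, at_end):
    """First-order difference of the two frames nearest the join boundary.

    Hypothetical stand-in for a spectral dynamic feature: it approximates
    the direction and rate of spectral change at the edge of a unit.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames to estimate dynamics")
    prev, nxt = (frames[-2], frames[-1]) if at_end else (frames[0], frames[1])
    return [n - p for p, n in zip(prev, nxt)]


def join_cost(left, right, w_static=1.0, w_dynamic=1.0):
    """Return (total, static, dynamic) for joining `left` to `right`.

    static  : distance between the last frame of `left` and the first of `right`
    dynamic : distance between the boundary deltas of the two units
    The weights are illustrative assumptions, not values from the paper.
    """
    static = euclidean(left[-1], right[0])
    dynamic = euclidean(boundary_delta(left, True), boundary_delta(right, False))
    return w_static * static + w_dynamic * dynamic, static, dynamic


if __name__ == "__main__":
    left = [[0.0, 0.0], [1.0, 1.0]]        # spectrum rising toward the join
    continuing = [[2.0, 2.0], [3.0, 3.0]]  # keeps rising at the same rate
    reversing = [[1.0, 1.0], [0.0, 0.0]]   # same boundary frame, falling
    print(join_cost(left, continuing))
    print(join_cost(left, reversing))
```

In this toy input the `reversing` unit matches the boundary frame exactly, so its static term is zero, yet its dynamic term is large because the spectral trajectory changes direction at the join. This is a constructed case; the abstract reports that in practice dynamic mismatch is likely to occur together with static mismatch.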

Pages: 2029-2032

Page count: 4
