Spectral modification for concatenative speech synthesis

Cited by: 0

Authors:

Wouters, J [1]

Macon, MW [1]

Affiliations:

[1] Oregon Grad Inst, Ctr Spoken Language Understanding, Beaverton, OR 97006 USA

Source:

2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI | 2000

Keywords:

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Concatenative synthesis can produce high-quality speech but is limited to the allophonic variations and voice types that were captured in the database. It would be desirable to modify speech units to remove formant discontinuities and to create new speaking styles, such as hypo- or hyper-articulated speech. Unfortunately, manipulating the spectral structure often leads to degraded speech quality. We investigate two speech modification strategies, one based on inverse filtering and the other on sinusoidal modeling, and we explain their merits and shortcomings for changing the spectral envelope in speech. We then propose a method which uses sinusoidal modeling and represents the complex sinusoidal amplitudes by an all-pole model. The all-pole model approximates the sinusoidal spectrum well, both in the amplitude and in the phase domain. We use the sinusoidal + all-pole model to control the spectral envelope in recorded speech. High-quality modified speech is generated from the model using sinusoidal synthesis. A perceptual test was conducted, which shows that the model was effective at changing vowel identities and was preferred over residual-excited LPC.
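
Illustrative sketch (not from the paper): the abstract does not say how the all-pole model is fitted to the sinusoidal amplitudes, so the Python below substitutes a standard technique, autocorrelation-method linear prediction applied to the sampled power spectrum. It fits magnitudes only; the phase match described in the abstract is not reproduced here. The function names fit_all_pole and envelope, the model order, and the test poles are all hypothetical choices made for this example.

```python
import numpy as np

def fit_all_pole(freqs_rad, amps, order):
    """Fit a[0..order] (a[0] = 1) and a gain so that gain / |A(e^{jw})|
    approximates the sinusoidal amplitudes `amps` at `freqs_rad`.
    Assumption: Yule-Walker fit on the sampled power spectrum."""
    freqs_rad = np.asarray(freqs_rad, dtype=float)
    power = np.asarray(amps, dtype=float) ** 2
    lags = np.arange(order + 1)
    # Autocorrelation implied by the power-spectrum samples.
    r = (power[None, :] * np.cos(np.outer(lags, freqs_rad))).sum(axis=1)
    r /= len(power)
    # Toeplitz matrix R[i, j] = r[|i - j|] for the normal equations.
    R = r[np.abs(lags[:, None] - lags[None, :])][:order, :order]
    a_tail = np.linalg.solve(R, -r[1:])
    a = np.concatenate(([1.0], a_tail))
    gain = np.sqrt(r[0] + a_tail @ r[1:])
    return a, gain

def envelope(a, gain, freqs_rad):
    """Complex all-pole response gain / A(e^{jw}) at the given frequencies."""
    basis = np.exp(-1j * np.outer(freqs_rad, np.arange(len(a))))
    return gain / (basis @ a)

# Hypothetical check: sample a known all-pole envelope and refit it.
poles = np.array([0.9 * np.exp(1j * 0.3 * np.pi),
                  0.85 * np.exp(1j * 0.6 * np.pi)])
a_true = np.real(np.poly(np.concatenate([poles, poles.conj()])))
K = 256
freqs = np.pi * (np.arange(K) + 0.5) / K  # dense grid on (0, pi)
amps = np.abs(envelope(a_true, 1.0, freqs))
a_hat, g_hat = fit_all_pole(freqs, amps, order=4)
```

With real speech the amplitudes would sit only at the harmonics of the fundamental, so the fit would be coarser than on this dense synthetic grid. Once fitted, the envelope could in principle be changed by moving the model's poles and resampling it at the sinusoid frequencies.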

Pages: 941-944

Number of pages: 4

50 items in total

[1] Control of spectral dynamics in concatenative speech synthesis
Wouters, J
Macon, MW
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2001, 9 (01): 30-38
[2] Spectral dynamics as a source of discontinuity in concatenative speech synthesis
Kirkpatrick, Barry
O'Brien, Darragh
Scaife, Ronan
Errity, Andrew
PROCEEDINGS OF THE 2007 15TH INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING, 2007: 615-+
[3] Sinusoidal plus all-pole modification based spectral smoothing for concatenative speech synthesis
Kang, H
Liu, WJ
Proceedings of the 2005 IEEE International Conference on Natural Language Processing and Knowledge Engineering (IEEE NLP-KE'05), 2005: 194-198
[4] Statistical prediction of spectral discontinuities of speech in concatenative synthesis
Pablo Trivino, Manuel
Alias, Francesc
PROCESAMIENTO DEL LENGUAJE NATURAL, 2008, (40): 67-74
[5] Feature Extraction for Spectral Continuity Measures in Concatenative Speech Synthesis
Kirkpatrick, Barry
O'Brien, Darragh
Scaife, Ronan
INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006: 1742-1745
[6] New objective distance measures for spectral discontinuities in concatenative speech synthesis
Vepa, J
King, S
Taylor, P
PROCEEDINGS OF THE 2002 IEEE WORKSHOP ON SPEECH SYNTHESIS, 2002: 223-226
[7] SPEECH SEGMENT SELECTION FOR CONCATENATIVE SYNTHESIS BASED ON SPECTRAL DISTORTION MINIMIZATION
IWAHASHI, N
KAIKI, N
SAGISAKA, Y
IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 1993, E76A (11): 1942-1948
[8] Inverse filter approach to pitch modification: Application to concatenative synthesis of female speech
Ansari, R
1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997: 1623-1626
[9] SET OF CONCATENATIVE UNITS FOR SPEECH SYNTHESIS
OLIVE, J
LIBERMAN, M
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1979, 65: S130-S130
[10] On the detection of discontinuities in concatenative speech synthesis
Pantazis, Yannis
Stylianou, Yannis
PROGRESS IN NONLINEAR SPEECH PROCESSING, 2007, 4391: 89-+