Feature Extraction for Spectral Continuity Measures in Concatenative Speech Synthesis

Cited by: 0

Authors:

Kirkpatrick, Barry [1]

O'Brien, Darragh [1]

Scaife, Ronan [1]

Affiliations:

[1] Dublin City Univ, Fac Engn & Comp, Dublin 9, Ireland

Source:

INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5 | 2006

Keywords:

speech synthesis; unit selection; join cost; wavelet transform; phase spectra;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The quality of concatenative speech synthesis depends on the cost function employed for unit selection. Effective cost functions for spectral continuity are difficult to define, and standard measures often do not accurately reflect human perception of discontinuity across a concatenated join. In this study the performance of a number of standard distance measures is compared for the task of detecting audible discontinuities in concatenated speech. Feature sets derived from the phase spectrum are also investigated. Feature extraction based on wavelet analysis is proposed to overcome some of the limitations of the standard measures tested. Receiver Operating Characteristic (ROC) curves are constructed for each measure from the results of a perceptual experiment and are used to rank the performance of each measure. Results indicate that phase spectra are comparable to magnitude spectra as a join cost for spectral continuity. Measures based on wavelet transform coefficients outperform all other measures tested.
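
The ROC-based ranking described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `roc_curve` and `auc`, the use of area under the curve as the ranking criterion, and the toy scores and listener labels, which are invented and do not reproduce any reported result.

```python
# Hypothetical sketch: rank join-cost measures by the area under their ROC curve.
# The data below are invented for illustration; they are not results from the paper.

def roc_curve(scores, labels):
    """Return ROC points (false positive rate, true positive rate).

    scores: one distance value per join (larger = more discontinuous).
    labels: 1 if listeners judged the join audibly discontinuous, else 0.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative join")
    # Sweep the decision threshold from the highest score downwards.
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Joins with tied scores cross the threshold together.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under a ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy perceptual labels and two hypothetical distance measures.
labels = [1, 1, 0, 1, 0, 0]
measures = {
    "measure_a": [0.9, 0.8, 0.7, 0.6, 0.3, 0.2],
    "measure_b": [0.5, 0.4, 0.6, 0.3, 0.2, 0.7],
}

areas = {name: auc(roc_curve(s, labels)) for name, s in measures.items()}
ranking = sorted(areas, key=areas.get, reverse=True)
print(ranking)
```

A measure whose curve lies closer to the top-left corner separates audible from inaudible joins better; summarising each curve by its area is one common way to turn that into a ranking, though the abstract does not state which summary the authors used.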


Pages: 1742-1745

Page count: 4

