A quantitative method for modeling context in concatenative synthesis using large speech database

Cited by: 0

Authors:

Hamza, W

Rashwan, M

Afify, M

Affiliation:

Source:

2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURAL NETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING - VOL IV: SIGNAL PROCESSING FOR COMMUNICATIONS; VOL V: SIGNAL PROCESSING EDUCATION SENSOR ARRAY & MULTICHANNEL SIGNAL PROCESSING AUDIO & ELECTROACOUSTICS; VOL VI: SIGNAL PROCESSING THEORY & METHODS STUDENT FORUM | 2001

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Modeling phonetic context is key to producing natural-sounding speech in concatenative speech synthesis. This paper proposes a new quantitative method for modeling context. In the proposed method, context is measured as the distance between leaves of the top-down, likelihood-based decision trees grown during construction of the acoustic inventory. Unlike other context modeling methods, this method allows the unit selection algorithm to borrow unit occurrences from other contexts when the context distances are small. This is done by incorporating the measured distance as an element of the unit selection cost function. The motivation is that using better unit occurrences from nearby contexts reduces the amount of speech modification required. The method also makes it easy to use long synthesis units, e.g. syllables or words, within the same unit selection framework.
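
The idea of adding a context distance to the unit selection cost can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the distance between decision-tree leaves is computed, so the sketch assumes each leaf is summarized by a hypothetical mean acoustic vector and uses plain Euclidean distance. The names `Leaf`, `Unit`, `context_distance`, `target_cost`, `select_unit`, the weights, and the pitch mismatch used as a stand-in for "required speech modification" are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Leaf:
    """A decision-tree leaf, summarized by an assumed mean acoustic vector."""
    name: str
    mean: tuple


@dataclass(frozen=True)
class Unit:
    """One recorded occurrence of a synthesis unit in the inventory."""
    unit_id: str
    leaf: Leaf        # the context (tree leaf) this occurrence belongs to
    pitch_hz: float   # used here as a proxy for how much modification is needed


def context_distance(a: Leaf, b: Leaf) -> float:
    # Assumption: Euclidean distance between leaf means stands in for the
    # paper's (unspecified here) distance between tree leaves.
    return math.dist(a.mean, b.mean)


def target_cost(unit: Unit, target_leaf: Leaf, target_pitch_hz: float,
                w_context: float = 1.0, w_pitch: float = 0.05) -> float:
    # The context distance is one element of the cost; the pitch mismatch
    # is a hypothetical measure of the required speech modification.
    modification = abs(unit.pitch_hz - target_pitch_hz)
    return w_context * context_distance(unit.leaf, target_leaf) + w_pitch * modification


def select_unit(candidates, target_leaf: Leaf, target_pitch_hz: float) -> Unit:
    # Pick the candidate with the lowest combined cost, regardless of
    # whether it comes from the exact target context.
    return min(candidates, key=lambda u: target_cost(u, target_leaf, target_pitch_hz))


# Toy inventory: three occurrences of the same phone in different contexts.
leaf_a = Leaf("a_after_nasal", (1.0, 2.0))
leaf_b = Leaf("a_after_stop", (1.2, 2.1))       # close to leaf_a
leaf_c = Leaf("a_after_fricative", (4.0, 6.0))  # far from leaf_a

units = [
    Unit("u1", leaf_a, 180.0),  # exact context, but needs heavy pitch change
    Unit("u2", leaf_b, 121.0),  # near context, almost no modification
    Unit("u3", leaf_c, 120.0),  # no modification, but distant context
]

best = select_unit(units, target_leaf=leaf_a, target_pitch_hz=120.0)
print(best.unit_id)
```

In this toy setup the occurrence from the nearby context wins over the exact-context occurrence because it needs far less modification, which is the borrowing behavior the abstract describes. A full system would also include concatenation costs and a search over unit sequences, which are omitted here.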


Pages: 789-792

Page count: 4
