A quantitative method for modeling context in concatenative synthesis using large speech database

Cited by: 0

Authors:

Hamza, W

Rashwan, M

Afify, M

Source:

2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURAL NETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING - VOL IV: SIGNAL PROCESSING FOR COMMUNICATIONS; VOL V: SIGNAL PROCESSING EDUCATION SENSOR ARRAY & MULTICHANNEL SIGNAL PROCESSING AUDIO & ELECTROACOUSTICS; VOL VI: SIGNAL PROCESSING THEORY & METHODS STUDENT FORUM | 2001

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Modeling phonetic context is key to natural-sounding output in concatenative speech synthesis. This paper proposes a new quantitative method for modeling context. In the proposed method, context is measured as the distance between leaves of the top-down, likelihood-based decision trees grown during construction of the acoustic inventory. Unlike other context modeling methods, this method allows the unit selection algorithm to borrow unit occurrences from other contexts when the context distance is small. This is done by including the measured distance as a term in the unit selection cost function. The motivation is that using better unit occurrences from nearby contexts reduces the amount of speech modification required. The method also makes it easy to use long synthesis units, e.g. syllables or words, within the same unit selection framework.
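The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the distance between leaves is computed or how it is weighted in the cost function. The sketch assumes that each leaf is identified by its root-to-leaf branch path, that the context distance is the number of tree edges between two leaves, and that the cost is a weighted sum of that distance and a hypothetical signal-modification cost. The names `leaf_distance`, `Candidate`, `target_cost`, and `select_unit` are all illustrative.

```python
from dataclasses import dataclass


def leaf_distance(path_a, path_b):
    """Number of tree edges between two leaves (assumed distance measure).

    Each leaf is given as its root-to-leaf path, a tuple of 0/1 branch choices.
    """
    common = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        common += 1
    # Edges up from leaf A to the deepest shared node, then down to leaf B.
    return (len(path_a) - common) + (len(path_b) - common)


@dataclass
class Candidate:
    name: str
    leaf_path: tuple          # decision-tree leaf this occurrence belongs to
    modification_cost: float  # hypothetical cost of the required signal modification


def target_cost(target_leaf, cand, w_context=1.0, w_mod=1.0):
    """Weighted sum of context distance and modification cost (assumed form)."""
    return (w_context * leaf_distance(target_leaf, cand.leaf_path)
            + w_mod * cand.modification_cost)


def select_unit(target_leaf, candidates, **weights):
    """Pick the candidate occurrence with the lowest cost."""
    return min(candidates, key=lambda c: target_cost(target_leaf, c, **weights))


target = (0, 1, 1)
candidates = [
    Candidate("same_context", (0, 1, 1), modification_cost=5.0),
    Candidate("near_context", (0, 1, 0), modification_cost=1.0),
    Candidate("far_context", (1, 0), modification_cost=0.5),
]
best = select_unit(target, candidates)
print(best.name)
```

With these made-up numbers the occurrence from the neighbouring leaf is selected: its small context distance is outweighed by the lower modification cost, which is the "borrowing" behaviour the abstract describes.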

Pages: 789-792

Page count: 4
