Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum

Cited by: 0

Authors:

Toda, T [1]

Saruwatari, H [1]

Shikano, K [1]

Affiliation:

[1] Nara Inst Sci & Technol, Grad Sch Informat Sci, Ikoma, Nara 6300101, Japan

Source:

2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING | 2001

Keywords:

DOI:

Not available

Chinese Library Classification:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

In the voice conversion algorithm based on the Gaussian mixture model (GMM) applied to STRAIGHT, the quality of converted speech is degraded because the converted spectrum is excessively smoothed. In this paper, we propose a GMM-based algorithm with dynamic frequency warping to avoid this over-smoothing. We also propose adding a weighted residual spectrum, defined as the difference between the GMM-based converted spectrum and the frequency-warped spectrum, to avoid degrading conversion accuracy for speaker individuality. Evaluation experiments show that, with a properly weighted residual spectrum, the proposed method yields better converted speech quality than the GMM-based algorithm while matching its conversion accuracy for speaker individuality.
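The two operations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the squared-difference cost, the dynamic-programming step pattern, the use of log-spectrum vectors, and the single scalar weight are all assumptions made for illustration. The sketch warps the frequency axis of a source spectrum toward a GMM-converted spectrum, then adds back a weighted share of the residual (GMM-converted spectrum minus frequency-warped spectrum).

```python
import numpy as np


def dynamic_frequency_warp(src, ref):
    """Warp the frequency axis of `src` toward `ref` (hypothetical helper).

    Both inputs are 1-D spectra of equal length. A dynamic-programming
    alignment over frequency bins (assumed squared-difference cost) gives
    a monotonic map from each output bin to a source bin.
    """
    n = len(src)
    cost = (src[:, None] - ref[None, :]) ** 2
    acc = np.full((n, n), np.inf)
    acc[0, 0] = cost[0, 0]
    for i in range(n):
        for j in range(n):
            if i == 0 and j == 0:
                continue
            best = min(
                acc[i - 1, j] if i > 0 else np.inf,
                acc[i, j - 1] if j > 0 else np.inf,
                acc[i - 1, j - 1] if i > 0 and j > 0 else np.inf,
            )
            acc[i, j] = cost[i, j] + best

    # Backtrack from the top-right corner; for each output bin j keep the
    # lowest source bin i on the optimal path.
    i, j = n - 1, n - 1
    warp = np.zeros(n, dtype=int)
    warp[j] = i
    while i > 0 or j > 0:
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1, j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((acc[i - 1, j], i - 1, j))
        if j > 0:
            candidates.append((acc[i, j - 1], i, j - 1))
        _, i, j = min(candidates)
        warp[j] = i
    return src[warp], warp


def add_weighted_residual(warped, gmm_converted, weight):
    """Add a weighted residual spectrum to the frequency-warped spectrum.

    The residual is the GMM-converted spectrum minus the warped spectrum,
    as stated in the abstract. `weight` is an assumed scalar in [0, 1].
    """
    residual = gmm_converted - warped
    return warped + weight * residual
```

Under these assumptions, a weight of 0 returns the frequency-warped spectrum unchanged and a weight of 1 returns the GMM-converted spectrum, so intermediate weights trade spectral detail against closeness to the GMM-based conversion.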


Pages: 841-844

Page count: 4

50 related records in total

[31] An Immittance Spectral Frequency parameters quantization Algorithm based on Gaussian Mixture Model
Wang Xiaochen
Zhang Yong
Hu Ruimin
Du Xi
MINES 2009: FIRST INTERNATIONAL CONFERENCE ON MULTIMEDIA INFORMATION NETWORKING AND SECURITY, VOL 1, PROCEEDINGS, 2009: 324 - 328
[32] ON USING NON-LINEAR CANONICAL CORRELATION ANALYSIS FOR VOICE CONVERSION BASED ON GAUSSIAN MIXTURE MODEL
Jian Zhihua
Yang Zhen
Journal of Electronics(China), 2010, 27 (01) : 1 - 7
[33] VOICE CONVERSION ALGORITHM-BASED ON PIECEWISE-LINEAR CONVERSION RULES OF FORMANT FREQUENCY AND SPECTRUM TILT
MIZUNO, H
ABE, M
SPEECH COMMUNICATION, 1995, 16 (02) : 153 - 164
[34] Esophageal Speech Enhancement Based on Statistical Voice Conversion with Gaussian Mixture Models
Doi, Hironori
Nakamura, Keigo
Toda, Tomoki
Saruwatari, Hiroshi
Shikano, Kiyohiro
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2010, E93D (09): 2472 - 2482
[35] Voice Conversion Based on Gaussian Mixture Modules with Minimum Distance Spectral Mapping
Jin, Gui
Johnson, Michael T.
Liu, Jia
Lin, Xiaokang
2015 5TH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND TECHNOLOGY (ICIST), 2015: 356 - 359
[36] NOVEL AMPLITUDE SCALING METHOD FOR BILINEAR FREQUENCY WARPING-BASED VOICE CONVERSION
Shah, Nirmesh J.
Patil, Hemant A.
2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017: 5520 - 5524
[37] Query by Voice Example and sound similarity based on the Dynamic Time Warping algorithm
Niewiadomy, Dominik
Pelikant, Adam
PRZEGLAD ELEKTROTECHNICZNY, 2010, 86 (08): 143 - 146
[38] Gender based Voice Authentication Using Gaussian Mixture Model and Mel-Frequency Cepstrum Coefficients
Rajeh, Wahid
INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2022, 22 (01): 539 - 545
[39] Voice conversion based on Gaussian processes by using kernels modeling the spectral density with Gaussian mixture models
Bao, Jingyi
Xu, Ning
MODERN PHYSICS LETTERS B, 2018, 32 (34-36):
[40] A Voice Morphing Model Based on the Gaussian Mixture Model and Generative Topographic Mapping
Rassam, Murad A.
Almekhlafi, Rasha
Alosaily, Eman
Hassan, Haneen
Hassan, Reem
Saeed, Eman
Alqershi, Elham
EMERGING TRENDS IN INTELLIGENT COMPUTING AND INFORMATICS: DATA SCIENCE, INTELLIGENT INFORMATION SYSTEMS AND SMART COMPUTING, 2020, 1073 : 396 - 406
