Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum

Cited by: 0

Authors:

Toda, T [1]

Saruwatari, H [1]

Shikano, K [1]

Affiliations:

[1] Nara Inst Sci & Technol, Grad Sch Informat Sci, Ikoma, Nara 6300101, Japan

Source:

2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING | 2001

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification codes:

070206; 082403

Abstract:

In the voice conversion algorithm based on the Gaussian Mixture Model (GMM) applied to STRAIGHT, the quality of converted speech is degraded because the converted spectrum is excessively smoothed. In this paper, we propose a GMM-based algorithm with dynamic frequency warping to avoid this over-smoothing. We also propose adding a weighted residual spectrum, which is the difference between the GMM-based converted spectrum and the frequency-warped spectrum, to avoid degrading the conversion accuracy on speaker individuality. Evaluation experiments show that, with a properly weighted residual spectrum, the proposed method yields better converted speech quality than the GMM-based algorithm while matching its conversion accuracy on speaker individuality.
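The combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not confirm: spectra are treated as log-magnitude vectors for a single frame, the warping aligns the source spectrum to the GMM-converted spectrum with a plain dynamic-programming (DTW-style) search along the frequency axis, and the residual weight is a scalar or per-bin array. The names `dfw_warp` and `convert_frame` are hypothetical.

```python
import numpy as np

def dfw_warp(src, ref):
    """For each bin of `ref`, return the index of the `src` bin aligned to it.

    Assumption: a basic DTW along the frequency axis with squared-error cost;
    the paper's actual warping constraints may differ.
    """
    n, m = len(src), len(ref)
    cost = (src[:, None] - ref[None, :]) ** 2
    acc = np.full((n, m), np.inf)
    acc[0, 0] = cost[0, 0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            prev = min(
                acc[i - 1, j] if i > 0 else np.inf,
                acc[i, j - 1] if j > 0 else np.inf,
                acc[i - 1, j - 1] if i > 0 and j > 0 else np.inf,
            )
            acc[i, j] = cost[i, j] + prev
    # Backtrack from the last bin pair to the first to recover the path.
    i, j = n - 1, m - 1
    warp = np.zeros(m, dtype=int)
    warp[j] = i
    while i > 0 or j > 0:
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1, j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((acc[i - 1, j], i - 1, j))
        if j > 0:
            candidates.append((acc[i, j - 1], i, j - 1))
        _, i, j = min(candidates)
        warp[j] = i
    return warp

def convert_frame(src_log, gmm_log, weight):
    """Frequency-warped source spectrum plus a weighted residual.

    residual = GMM-converted spectrum - frequency-warped spectrum
    (the log-magnitude domain is an assumption).
    """
    warped = src_log[dfw_warp(src_log, gmm_log)]
    residual = gmm_log - warped
    return warped + weight * residual

# Toy frame: a sharp source peak and a smoother, shifted GMM-converted peak.
bins = np.arange(32)
src_log = np.exp(-0.5 * ((bins - 10) / 2.0) ** 2)
gmm_log = np.exp(-0.5 * ((bins - 14) / 4.0) ** 2)
mixed = convert_frame(src_log, gmm_log, weight=0.5)
```

In this sketch a weight of 0 keeps only the frequency-warped source spectrum, and a weight of 1 reproduces the GMM-converted spectrum exactly; intermediate weights trade off between the two, which mirrors the quality versus speaker-individuality trade-off the abstract describes.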

Pages: 841-844

Number of pages: 4
