Voice conversion using structured Gaussian mixture model in eigen space

Cited by: 0
Authors
Li, Yangchun [1]
Yu, Yibiao [1]
Affiliation
[1] School of Electronic and Information Engineering, Soochow University, Suzhou 215006, China
Source
Shengxue Xuebao/Acta Acustica | 2015 / Vol. 40 / No. 01
Keywords
Speech processing; Vector spaces; Gaussian distribution
DOI
Not available
Abstract
A new method of voice conversion in eigen space, based on the structured Gaussian mixture model, is proposed for non-parallel corpora without joint training. For each speaker, the cepstrum feature parameters are extracted and then mapped to the eigen space formed by the eigenvectors of the scatter matrix of the cepstrum features; the speaker's Structured Gaussian Mixture Model in the Eigen Space (SGMM-ES) is then trained on the mapped features. The SGMM-ES of the source speaker and that of the target speaker are trained separately, and the spectrum transform function is then derived according to the Acoustic Universal Structure (AUS) principle. Experimental results show that the average correct recognition rate of the converted speech reaches 95.25% and the average spectral distortion is 1.25, improvements of 0.8% and 7.3%, respectively, over the SGMM method. ABX and MOS evaluations indicate that the conversion performance is quite close to that of the traditional method under the parallel-corpora condition. The results show that voice conversion in eigen space based on the structured Gaussian mixture model is effective for non-parallel corpora. © 2015, Science Press. All rights reserved.
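The eigen-space mapping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `eigen_space_projection`, the mean-centering of the features, the choice to keep all eigenvectors, and the synthetic data standing in for cepstral frames are all assumptions made for illustration. The sketch covers only the projection onto the eigenvectors of the scatter matrix, not the SGMM training or the AUS-based transform.

```python
import numpy as np


def eigen_space_projection(features):
    """Project feature frames onto the eigenvectors of their scatter matrix.

    features: array of shape (n_frames, n_dims), e.g. cepstral coefficients.
    Returns (projected, mean, eigvecs, eigvals).
    Assumption: frames are mean-centered and all eigenvectors are kept.
    """
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    centered = features - mean

    # Scatter matrix: sum of outer products of the centered frames.
    scatter = centered.T @ centered

    # eigh is meant for symmetric matrices; eigenvalues come back ascending.
    eigvals, eigvecs = np.linalg.eigh(scatter)

    # Reorder so the direction with the largest scatter comes first.
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]

    # Coordinates of each frame in the eigen space.
    projected = centered @ eigvecs
    return projected, mean, eigvecs, eigvals


# Synthetic, correlated 4-dimensional data as a stand-in for cepstral frames.
rng = np.random.default_rng(0)
frames = rng.normal(size=(500, 4)) @ rng.normal(size=(4, 4))

projected, mean, eigvecs, eigvals = eigen_space_projection(frames)
```

In this sketch the projected features are decorrelated (their scatter matrix is diagonal), and the original frames can be recovered as `projected @ eigvecs.T + mean`. A per-speaker mixture model would then be fitted to `projected` rather than to the raw frames.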
Pages: 12-19
Related papers
  • [1] Voice conversion using structured Gaussian mixture model in cepstrum eigenspace
    LI Yangchun
    YU Yibiao
    [J]. Chinese Journal of Acoustics, 2015, 34 (03) : 325 - 336
  • [2] Voice Conversion Using Structrued Gaussian Mixture Model
    Zeng, Daojian
    Yu, Yibiao
    [J]. 2010 IEEE 10TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS (ICSP2010), VOLS I-III, 2010, : 541 - 544
  • [3] Voice conversion algorithm using phoneme Gaussian mixture model
    Sheng, L
    Yin, JX
    Huang, JC
    [J]. PROCEEDINGS OF THE 2004 INTERNATIONAL SYMPOSIUM ON INTELLIGENT MULTIMEDIA, VIDEO AND SPEECH PROCESSING, 2004, : 5 - 8
  • [4] Voice Conversion Using Gaussian Mixture Models
    D'souza, Kevin
    Talele, K. T. V.
    [J]. 2015 INTERNATIONAL CONFERENCE ON COMMUNICATION, INFORMATION & COMPUTING TECHNOLOGY (ICCICT), 2015,
  • [5] Voice conversion using Viterbi algorithm based on Gaussian mixture model
    Jian Zhi-Hua
    Yang Zhen
    [J]. 2007 INTERNATIONAL SYMPOSIUM ON INTELLIGENT SIGNAL PROCESSING AND COMMUNICATION SYSTEMS, VOLS 1 AND 2, 2007, : 40 - 43
  • [6] Efficient Gaussian Mixture Model Evaluation in Voice Conversion
    Tian, Jilei
    Nurminen, Jani
    Popa, Victor
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 2282 - 2285
  • [7] Voice conversion using canonical correlation analysis based on Gaussian mixture model
    Jian, ZhiHua
    Yang, Zhen
    [J]. SNPD 2007: EIGHTH ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING, AND PARALLEL/DISTRIBUTED COMPUTING, VOL 1, PROCEEDINGS, 2007, : 210 - +
  • [8] VOICE CONVERSION BASED ON MATRIX VARIATE GAUSSIAN MIXTURE MODEL
    Saito, Daisuke
    Doi, Hidenobu
    Minematsu, Nobuaki
    Hirose, Keikichi
    [J]. 2014 12TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP), 2014, : 567 - 571
  • [9] Contribution on Gaussian Mixture Model Order Determination for Voice Conversion
    Ben Amara, Ahmed
    Ben Jebara, Sofia
    [J]. 9TH INTERNATIONAL SYMPOSIUM ON SIGNAL, IMAGE, VIDEO AND COMMUNICATIONS (ISIVC 2018), 2018, : 87 - 92
  • [10] Phoneme-based spectral voice conversion using temporal decomposition and Gaussian mixture model
    Nguyen, Binh Phu
    Akagi, Masato
    [J]. 2008 SECOND INTERNATIONAL CONFERENCE ON COMMUNICATIONS AND ELECTRONICS, 2008, : 222 - 227