Voice Conversion Based on Gaussian Mixture Modules with Minimum Distance Spectral Mapping

Cited by: 0
Authors
Jin, Gui [1]
Johnson, Michael T. [1]
Liu, Jia [1]
Lin, Xiaokang [1]
Affiliation
[1] Tsinghua Univ, Dept Elect Engn, Tsinghua Natl Lab Informat Sci & Technol, Beijing 100084, Peoples R China
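Keywords
Voice conversion; Gaussian mixture models; frequency warping; point-to-point mapping
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract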
Voice conversion (VC) is the task of modifying a source speaker's voice to match that of a specific target speaker. Traditional methods use Gaussian mixture models (GMMs), but the quality of the converted speech is often badly degraded by over-smoothing. More recent approaches such as Dynamic Frequency Warping (DFW) preserve more spectral detail during transformation but require explicit formant frequency estimates, and estimation errors lead to poor similarity between the converted speech and the target speaker. This paper proposes a new method for voice conversion called Minimum Distance Spectral Mapping (MDSM), based on a frequency-warped point-to-point mapping that robustly and accurately transforms formant frequencies while also maintaining spectral details. MDSM uses a minimum distance alignment between source and target speakers rather than direct formant estimates, which increases robustness and also preserves other spectral details such as formant bandwidth. Results show that the proposed method offers a good trade-off between voice quality and identity similarity, outperforming traditional GMM and DFW in both subjective and objective evaluations.
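The abstract does not say how the minimum distance alignment is computed. The sketch below is only an illustration of the general idea of aligning two spectral envelopes without formant estimates. It assumes a dynamic-programming alignment along the frequency axis with a squared-difference cost, which is a stand-in and not the authors' MDSM procedure. The function names `min_distance_path` and `apply_warp` are hypothetical.

```python
def min_distance_path(src, tgt):
    """Align frequency bins of two spectral envelopes by dynamic programming.

    Assumption: squared difference as the local distance and the usual
    monotonic step pattern (diagonal, vertical, horizontal).
    Returns the minimum-distance path as a list of (source_bin, target_bin).
    """
    n, m = len(src), len(tgt)
    inf = float("inf")
    # cost[i][j]: minimum accumulated distance aligning src[:i+1] with tgt[:j+1]
    cost = [[inf] * m for _ in range(n)]
    back = [[None] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = (src[i] - tgt[j]) ** 2
            if i == 0 and j == 0:
                cost[i][j] = d
                continue
            best, arg = inf, None
            for pi, pj in ((i - 1, j - 1), (i - 1, j), (i, j - 1)):
                if pi >= 0 and pj >= 0 and cost[pi][pj] < best:
                    best, arg = cost[pi][pj], (pi, pj)
            cost[i][j] = d + best
            back[i][j] = arg
    # Follow the back pointers from the last bin pair to the first.
    path = []
    cell = (n - 1, m - 1)
    while cell is not None:
        path.append(cell)
        cell = back[cell[0]][cell[1]]
    path.reverse()
    return path


def apply_warp(src, path, m):
    """Point-to-point mapping: move source values to their aligned target bins.

    Source bins that share a target bin are averaged.
    """
    sums = [0.0] * m
    counts = [0] * m
    for i, j in path:
        sums[j] += src[i]
        counts[j] += 1
    return [s / c for s, c in zip(sums, counts)]


# Toy envelopes: the same peak shape, centred on bin 2 (source) and bin 5 (target).
source = [0, 1, 5, 1, 0, 0, 0, 0]
target = [0, 0, 0, 0, 1, 5, 1, 0]
path = min_distance_path(source, target)
warped = apply_warp(source, path, len(target))
```

In this toy case the peak moves from bin 2 to bin 5 while its shape around the peak is kept, which loosely mirrors the claim that alignment-based warping can shift formants without flattening details such as bandwidth. Real envelopes would need a smoother cost and constraints on the warping slope.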
Pages: 356-359
Page count: 4