Improving the Quality of Standard GMM-Based Voice Conversion Systems by Considering Physically Motivated Linear Transformations

Cited by: 0
Authors
Zorila, Tudor-Catalin [1]
Erro, Daniel [1]
Hernaez, Inma [1]
Affiliations
[1] POLITEHN Univ Bucharest UPB, Bucharest, Romania
Keywords
voice conversion; Gaussian mixture models; dynamic frequency warping; amplitude scaling; linear transformation
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a new method to train traditional voice conversion functions based on Gaussian mixture models, linear transforms and cepstral parameterization. Instead of using statistical criteria, this method calculates a set of linear transforms that represent physically meaningful spectral modifications such as frequency warping and amplitude scaling. Our experiments indicate that the proposed training method leads to significant improvements in the average quality of the converted speech with respect to traditional statistical methods. This is achieved without modifying the input/output parameters or the shape of the conversion function.
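The "shape of the conversion function" mentioned in the abstract is not spelled out in this record. A minimal sketch, assuming the commonly used form in GMM-based voice conversion, where each source cepstral frame x is mapped to a posterior-weighted sum of per-component linear transforms, y = Σ_i p_i(x)·(A_i x + b_i). All names below (`gmm_posteriors`, `convert_frame`, `A`, `b`) are hypothetical, and the sketch does not show how the paper derives its transforms from frequency warping and amplitude scaling; it only illustrates how such transforms would be applied once available.

```python
import numpy as np

def gmm_posteriors(x, weights, means, covs):
    """Posterior probability of each Gaussian component given frame x."""
    log_p = np.empty(len(weights))
    for i, (w, mu, cov) in enumerate(zip(weights, means, covs)):
        diff = x - mu
        _, logdet = np.linalg.slogdet(cov)
        maha = diff @ np.linalg.solve(cov, diff)
        # Log of weight times Gaussian density, computed in the log domain.
        log_p[i] = np.log(w) - 0.5 * (logdet + maha + len(x) * np.log(2 * np.pi))
    log_p -= log_p.max()          # stabilise before exponentiating
    p = np.exp(log_p)
    return p / p.sum()

def convert_frame(x, weights, means, covs, A, b):
    """Assumed conversion form: posterior-weighted sum of linear transforms."""
    p = gmm_posteriors(x, weights, means, covs)
    return sum(p_i * (A_i @ x + b_i) for p_i, A_i, b_i in zip(p, A, b))

# Toy usage with two components and 3-dimensional "cepstral" frames.
x = np.array([0.2, -0.1, 0.05])
weights = np.array([0.4, 0.6])
means = [np.zeros(3), np.ones(3)]
covs = [np.eye(3), 2.0 * np.eye(3)]
A = [np.eye(3), 0.9 * np.eye(3)]       # hypothetical per-component matrices
b = [np.zeros(3), 0.1 * np.ones(3)]    # hypothetical per-component biases
y = convert_frame(x, weights, means, covs, A, b)
```

Under this assumed form, changing the training method only changes the values of `A` and `b`; the inputs, outputs and the weighted-sum structure stay the same, which is consistent with the abstract's last sentence.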
Pages: 30-39
Number of pages: 10
Related papers
27 in total
  • [21] IMPROVING VOICE QUALITY OF HMM-BASED SPEECH SYNTHESIS USING VOICE CONVERSION METHOD
    Jiao, Yishan
    Xie, Xiang
    Na, Xingyu
    Tu, Ming
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [22] High-quality voice conversion system based on GMM statistical parameters and RBF neural network
    CHEN Xian-tong
    ZHANG Ling-hua
    [J]. The Journal of China Universities of Posts and Telecommunications, 2014, (05) : 68 - 75
  • [23] High-quality voice conversion system based on GMM statistical parameters and RBF neural network
    CHEN Xian-tong
    ZHANG Ling-hua
    [J]. The Journal of China Universities of Posts and Telecommunications, 2014, 21 (05) - 75+93
  • [24] Quality Improvement of Voice Conversion Systems Based on Trellis Structured Vector Quantization
    Eslami, Mahdi
    Sheikhzadeh, Hamid
    Sayadiyan, Abolghasem
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 672 - +
  • [25] High quality voice conversion through phoneme-based linear mapping functions with STRAIGHT for mandarin
    Liu, Kun
    Zhang, Jianping
    Yan, Yonghong
    [J]. FOURTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 4, PROCEEDINGS, 2007, : 410 - 414
  • [26] A robust PID controller based on linear quadratic gaussian approach for improving frequency stability of power systems considering renewables
    Khamies, Mohamed
    Magdy, Gaber
    Ebeed, Mohamed
    Kamel, Salah
    [J]. ISA TRANSACTIONS, 2021, 117 : 118 - 138
  • [27] Improving peer coordination quality in mobile P2P networks considering peer awareness and group synchronization: Implementation and performance evaluation of two fuzzy-based systems
    Kolici, Vladi
    Liu Yi
    Qafzezi, Ermioni
    Elmazi, Donald
    Barolli, Leonard
    [J]. JOURNAL OF HIGH SPEED NETWORKS, 2020, 26 (01) : 27 - 39