Improving the Quality of Standard GMM-Based Voice Conversion Systems by Considering Physically Motivated Linear Transformations

被引:0
|
作者
Zorila, Tudor-Catalin [1 ]
Erro, Daniel [1 ]
Hernaez, Inma [1 ]
机构
[1] POLITEHN Univ Bucharest UPB, Bucharest, Romania
关键词
voice conversion; Gaussian mixture models; dynamic frequency warping; amplitude scaling; linear transformation;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
This paper presents a new method to train traditional voice conversion functions based on Gaussian mixture models, linear transforms and cepstral parameterization. Instead of using statistical criteria, this method calculates a set of linear transforms that represent physically meaningful spectral modifications such as frequency warping and amplitude scaling. Our experiments indicate that the proposed training method leads to significant improvements in the average quality of the converted speech with respect to traditional statistical methods. This is achieved without modifying the input/output parameters or the shape of the conversion function.
引用
收藏
页码:30 / 39
页数:10
相关论文
共 28 条
  • [1] Improving the computational performance of standard GMM-based voice conversion systems used in real-time applications
    Ben Othmane, Imen
    Di Martino, Joseph
    Ouni, Kais
    [J]. 2018 INTERNATIONAL CONFERENCE ON ELECTRONICS, CONTROL, OPTIMIZATION AND COMPUTER SCIENCE (ICECOCS), 2018,
  • [2] Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech
    Nakamura, Keigo
    Toda, Tomoki
    Saruwatari, Hiroshi
    Shikano, Kiyohiro
    [J]. SPEECH COMMUNICATION, 2012, 54 (01) : 134 - 146
  • [3] Incorporating Global Variance in the Training Phase of GMM-based Voice Conversion
    Hwang, Hsin-Te
    Tsao, Yu
    Wang, Hsin-Min
    Wang, Yih-Ru
    Chen, Sin-Horng
    [J]. 2013 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2013,
  • [4] GMM-Based Speaker Gender and Age Classification After Voice Conversion
    Pribil, Jiri
    Pribilova, Anna
    Matousek, Jindrich
    [J]. 2016 FIRST INTERNATIONAL WORKSHOP ON SENSING, PROCESSING AND LEARNING FOR INTELLIGENT MACHINES (SPLINE), 2016,
  • [5] Enhancing a Glossectomy Patient's Speech via GMM-based Voice Conversion
    Tanaka, Kei
    Hara, Sunao
    Abe, Masanobu
    Minagi, Shogo
    [J]. 2016 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2016,
  • [6] Voice Conversion Using Bilinear Model Integrated with Joint GMM-based Classification
    Sun, Xinjian
    Zhang, Xiongwei
    Yang, Jibin
    Cao, Tieyong
    [J]. 2013 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND TECHNOLOGY (ICIST), 2013, : 1225 - 1228
  • [7] Modulation Spectrum-Based Post-Filter for GMM-Based Voice Conversion
    Takamichi, Shinnosuke
    Toda, Tomoki
    Black, Alan W.
    Nakamura, Satoshi
    [J]. 2014 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2014,
  • [8] Improving the Performance of GMM Based Voice Conversion Method
    Song, Peng
    Zhao, Li
    [J]. PACIIA: 2008 PACIFIC-ASIA WORKSHOP ON COMPUTATIONAL INTELLIGENCE AND INDUSTRIAL APPLICATION, VOLS 1-3, PROCEEDINGS, 2008, : 436 - 440
  • [9] Alleviating the Over-Smoothing Problem in GMM-Based Voice Conversion with Discriminative Training
    Hwang, Hsin-Te
    Tsao, Yu
    Wang, Hsin-Min
    Wang, Yih-Ru
    Chen, Sin-Horng
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 3061 - 3065
  • [10] MODULATION SPECTRUM-CONSTRAINED TRAJECTORY TRAINING ALGORITHM FOR GMM-BASED VOICE CONVERSION
    Takamichi, Shinnosuke
    Toda, Tomoki
    Black, Alan W.
    Nakamura, Satoshi
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4859 - 4863