A Study of Mutual Information for GMM-Based Spectral Conversion

Authors
Hwang, Hsin-Te [1]
Tsao, Yu
Wang, Hsin-Min
Wang, Yih-Ru [1]
Chen, Sin-Horng [1]
Affiliations
[1] Natl Chiao Tung Univ, Dept Elect Engn, Hsinchu, Taiwan
Keywords
Voice conversion; mutual information; GMM
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Gaussian mixture model (GMM)-based method has dominated the field of voice conversion (VC) for the last decade. However, the converted spectra are excessively smoothed and thus produce muffled converted sound. In this study, we improve the speech quality by enhancing the dependency between the source feature vectors (natural sound) and the converted feature vectors (converted sound). It is believed that enhancing this dependency can bring the converted sound closer to the natural sound. To this end, we propose an integrated maximum a posteriori and mutual information (MAPMI) criterion for parameter generation in spectral conversion. Experimental results from a formal listening test show that speech converted by the proposed MAPMI method is of higher quality than speech converted by the conventional method.
Pages: 78-81
Page count: 4
Related papers
50 in total
  • [1] EXPLORING MUTUAL INFORMATION FOR GMM-BASED SPECTRAL CONVERSION
    Hwang, Hsin-Te
    Tsao, Yu
    Wang, Hsin-Min
    Wang, Yih-Ru
    Chen, Sin-Horng
    [J]. 2012 8TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING, 2012, : 50 - 54
  • [2] Experimental Study on GMM-Based Speaker Recognition
    Ye, Wenxing
    Wu, Dapeng
    Nucci, Antonio
    [J]. MOBILE MULTIMEDIA/IMAGE PROCESSING, SECURITY, AND APPLICATIONS 2010, 2010, 7708
  • [3] GMM-BASED SIGNIFICANCE DECODING
    Abdelaziz, Ahmed Hussen
    Zeiler, Steffen
    Kolossa, Dorothea
    Leutnant, Volker
    Haeb-Umbach, Reinhold
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 6827 - 6831
  • [4] Incorporating Global Variance in the Training Phase of GMM-based Voice Conversion
    Hwang, Hsin-Te
    Tsao, Yu
    Wang, Hsin-Min
    Wang, Yih-Ru
    Chen, Sin-Horng
    [J]. 2013 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2013,
  • [5] GMM-BASED ITERATIVE ENTROPY CODING FOR SPECTRAL ENVELOPES OF SPEECH AND AUDIO
    Korse, Srikanth
    Fuchs, Guillaume
    Backstrom, Tom
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5689 - 5693
  • [6] GMM-Based Speaker Gender and Age Classification After Voice Conversion
    Pribil, Jiri
    Pribilova, Anna
    Matousek, Jindrich
    [J]. 2016 FIRST INTERNATIONAL WORKSHOP ON SENSING, PROCESSING AND LEARNING FOR INTELLIGENT MACHINES (SPLINE), 2016,
  • [7] Enhancing a Glossectomy Patient's Speech via GMM-based Voice Conversion
    Tanaka, Kei
    Hara, Sunao
    Abe, Masanobu
    Minagi, Shogo
    [J]. 2016 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2016,
  • [8] Voice Conversion Using Bilinear Model Integrated with Joint GMM-based Classification
    Sun, Xinjian
    Zhang, Xiongwei
    Yang, Jibin
    Cao, Tieyong
    [J]. 2013 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND TECHNOLOGY (ICIST), 2013, : 1225 - 1228
  • [9] Multi-frame GMM-based block quantisation of line spectral frequencies
    So, S
    Paliwal, KK
    [J]. SPEECH COMMUNICATION, 2005, 47 (03) : 265 - 276
  • [10] Modulation Spectrum-Based Post-Filter for GMM-Based Voice Conversion
    Takamichi, Shinnosuke
    Toda, Tomoki
    Black, Alan W.
    Nakamura, Satoshi
    [J]. 2014 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2014,