A Study of Mutual Information for GMM-Based Spectral Conversion

Cited by: 0

Authors:

Hwang, Hsin-Te [1]

Tsao, Yu

Wang, Hsin-Min

Wang, Yih-Ru [1]

Chen, Sin-Horng [1]

Affiliations:

[1] Natl Chiao Tung Univ, Dept Elect Engn, Hsinchu, Taiwan

Source:

13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3 | 2012

Keywords:

Voice conversion; mutual information; GMM

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The Gaussian mixture model (GMM)-based method has dominated the field of voice conversion (VC) for the last decade. However, the converted spectra are excessively smoothed and therefore produce muffled converted sound. In this study, we improve speech quality by enhancing the dependency between the source feature vectors (natural sound) and the converted feature vectors (converted sound). It is believed that enhancing this dependency can bring the converted sound closer to the natural sound. To this end, we propose an integrated maximum a posteriori and mutual information (MAPMI) criterion for parameter generation in spectral conversion. Experimental results from a formal listening test demonstrate that the proposed MAPMI method yields higher-quality converted speech than the conventional method.
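To make the notion of "dependency" between two sets of feature vectors concrete, the sketch below estimates mutual information under a joint-Gaussian assumption, using the standard closed form I(X;Y) = ½ (log|Σxx| + log|Σyy| − log|Σ|). This is only an illustration of the quantity named in the abstract, not the authors' MAPMI criterion; the function name, the synthetic data, and the noise levels are all hypothetical.

```python
import numpy as np


def gaussian_mutual_information(x, y):
    """Estimate I(X;Y) in nats, assuming X and Y are jointly Gaussian.

    x: array of shape (n_frames, dx); y: array of shape (n_frames, dy).
    Illustrative helper only; not taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x.shape[1]
    # Sample covariance of the stacked vectors [x, y].
    cov = np.cov(np.hstack([x, y]), rowvar=False)
    # slogdet returns (sign, log|det|); working in logs avoids overflow.
    _, logdet_xx = np.linalg.slogdet(cov[:dx, :dx])
    _, logdet_yy = np.linalg.slogdet(cov[dx:, dx:])
    _, logdet_joint = np.linalg.slogdet(cov)
    return 0.5 * (logdet_xx + logdet_yy - logdet_joint)


# Synthetic stand-ins for source and converted feature vectors (assumed data).
rng = np.random.default_rng(0)
source = rng.standard_normal((5000, 2))
tight = source + 0.1 * rng.standard_normal((5000, 2))  # strongly dependent
loose = source + 2.0 * rng.standard_normal((5000, 2))  # weakly dependent

mi_tight = gaussian_mutual_information(source, tight)
mi_loose = gaussian_mutual_information(source, loose)
print(f"MI (low noise):  {mi_tight:.3f} nats")
print(f"MI (high noise): {mi_loose:.3f} nats")
```

In this toy setting, the output that tracks the source more closely has the larger mutual information, which matches the intuition in the abstract that a stronger dependency corresponds to converted features staying closer to the natural ones.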


Pages: 78-81

Page count: 4
