VOICE CONVERSION BASED ON MATRIX VARIATE GAUSSIAN MIXTURE MODEL

Cited by: 0

Authors:

Saito, Daisuke [1]

Doi, Hidenobu [1]

Minematsu, Nobuaki [1]

Hirose, Keikichi [1]

Affiliation:

[1] Univ Tokyo, Tokyo, Japan

Source:

2014 12TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP) | 2014

Keywords:

Voice conversion; Gaussian mixture model; matrix variate distribution; matrix variate normal; matrix variate Gaussian mixture model; SPEECH RECOGNITION;

DOI:

Not available

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Subject classification codes:

0808 ; 0809 ;

Abstract:

This paper describes a novel approach to constructing a mapping function between a given speaker pair using matrix variate probability density functions (PDFs). In voice conversion studies, two important functions should be realized: 1) precise modeling of both the source and target feature spaces, and 2) construction of a proper transform function between these spaces. Voice conversion based on the Gaussian mixture model (GMM) is widely used because of its flexibility and ease of handling. In GMM-based approaches, a joint vector space of the source and target is first constructed, and the joint PDF of the two vectors is modeled as a GMM in that joint vector space. The joint vector approach mainly focuses on precise modeling of the 'joint' feature space, and does not always construct a proper transform between the two feature spaces. In contrast, the proposed method constructs the joint PDF as a GMM in a matrix variate space whose rows and columns respectively correspond to the two functions, and it has the potential to precisely model both the characteristics of the feature spaces and the relation between the source and target spaces. Experimental results show that the proposed method improves the performance of voice conversion.
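The matrix variate normal density that underlies a matrix variate GMM component can be sketched as follows. This is an illustrative sketch and not the authors' implementation: the function names, the choice of a 2 x 3 matrix (one row each for source and target, three feature dimensions), and the covariance values in the usage note are all assumptions made for the example. The second function evaluates the same density through the standard identity that vec(X) is Gaussian with covariance V ⊗ U, which shows how the row covariance U and column covariance V factor what a joint-vector model would treat as one full covariance.

```python
import numpy as np


def matrix_normal_logpdf(X, M, U, V):
    """Log-density of a matrix variate normal.

    X, M: (n, p) observation and mean matrices.
    U: (n, n) covariance among rows; V: (p, p) covariance among columns.
    """
    n, p = X.shape
    D = X - M
    # tr[V^{-1} D^T U^{-1} D], using solves instead of explicit inverses.
    quad = np.trace(np.linalg.solve(V, D.T) @ np.linalg.solve(U, D))
    _, logdet_U = np.linalg.slogdet(U)
    _, logdet_V = np.linalg.slogdet(V)
    return -0.5 * (n * p * np.log(2.0 * np.pi)
                   + p * logdet_U + n * logdet_V + quad)


def kron_gaussian_logpdf(X, M, U, V):
    """Same density via vec(X) ~ N(vec(M), V kron U), column-stacked vec."""
    d = (X - M).flatten(order="F")
    cov = np.kron(V, U)
    _, logdet = np.linalg.slogdet(cov)
    quad = d @ np.linalg.solve(cov, d)
    return -0.5 * (d.size * np.log(2.0 * np.pi) + logdet + quad)
```

As a usage example under the same assumptions, calling both functions with `U = [[2.0, 0.5], [0.5, 1.0]]`, any symmetric positive definite 3 x 3 `V`, and arbitrary 2 x 3 matrices `X` and `M` should give the same value up to floating-point error. A mixture model would combine several such components with mixture weights.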


Pages: 567-571

Number of pages: 5

