Voice Conversion without Parallel Speech Corpus Based on Mixtures of Linear Transform

Cited by: 1
Authors
Jian, Zhi-Hua [1 ]
Yang, Zhen [1 ]
Institutions
[1] Nanjing Univ Post & Telecommun, Inst Signal Proc & Transmiss, Nanjing, Peoples R China
Keywords
Voice conversion; multimedia application; Ms-LT; EM algorithm
DOI
10.1109/WICOM.2007.701
Chinese Library Classification (CLC)
TP39 [applications of computers]
Subject classification codes
081203; 0835
Abstract
This paper presents a voice conversion algorithm based on mixtures of linear transforms (Ms-LT) that avoids the parallel training data required by conventional approaches. Within a maximum likelihood framework, the EM algorithm is used to estimate the parameters of the conversion function, and the chirp z-transform is applied to enhance the spectral envelope, which is smoothed by the linear weighted averaging. The proposed voice conversion system is evaluated with both objective and subjective measures. The experimental results demonstrate that the approach effectively transforms speaker identity and achieves results comparable to those of conventional methods that rely on a parallel corpus.
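The record does not spell out the exact Ms-LT parameterization, but a common form of a mixture-of-linear-transforms conversion function is F(x) = Σ_k p_k(x)(A_k x + b_k), where p_k(x) is the posterior probability of mixture component k under a GMM on the source features, and A_k, b_k are per-component affine transforms (in the paper, estimated via EM). The sketch below illustrates only the conversion step under that assumption; all names (`gmm_posteriors`, `convert`) and the diagonal-covariance choice are illustrative, not taken from the paper.

```python
import numpy as np

def gmm_posteriors(x, weights, means, covs):
    """Posterior p_k(x) of each component of a diagonal-covariance GMM.

    x: (D,) source feature vector; weights: (K,); means, covs: (K, D).
    """
    diff = x - means                              # (K, D)
    log_det = np.sum(np.log(covs), axis=1)        # log|diag(cov_k)|
    mahal = np.sum(diff ** 2 / covs, axis=1)      # Mahalanobis terms
    log_lik = -0.5 * (mahal + log_det + means.shape[1] * np.log(2 * np.pi))
    log_post = np.log(weights) + log_lik
    log_post -= log_post.max()                    # stabilize before exp
    post = np.exp(log_post)
    return post / post.sum()                      # normalized posteriors

def convert(x, weights, means, covs, A, b):
    """Soft mixture of linear transforms: F(x) = sum_k p_k(x) (A_k x + b_k).

    A: (K, D, D) per-component matrices; b: (K, D) per-component offsets.
    """
    p = gmm_posteriors(x, weights, means, covs)   # (K,)
    return np.einsum('k,kij,j->i', p, A, x) + p @ b
```

With a single component the posterior is 1 and `convert` reduces to the plain affine map `A[0] @ x + b[0]`; with several components the output interpolates smoothly between the per-component transforms, which is the averaging the chirp z-transform step is said to compensate for.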
Pages: 2825-2828 (4 pages)