An Approach to Cross-Lingual Voice Conversion

Cited by: 0

Authors:

Rallabandi, Sai Sirisha [1]

Gangashetty, Suryakanth V. [1]

Affiliation:

[1] Int Inst Informat Technol, Speech Proc Lab, Hyderabad, India

Source:

2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2019

Keywords:

Deep Neural Networks; Cross-Lingual Voice Conversion; Scaled Exponential Linear Units; Mel Generalised Cepstral Coefficients; Auto-encoded speech;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Most prevalent multilingual Text-to-Speech (TTS) synthesis systems exhibit an unnatural speaker shift at language boundaries when they are used for code-mixed TTS synthesis. Because collecting polyglot speech is non-trivial, many alternative approaches have been explored. Cross-Lingual Voice Conversion (CLVC) is one of them: it generates speech with the desired speaker and language identities. Our aim in this paper is to design a lightweight CLVC framework between a pair of Mandarin-English speakers. CLVC is more challenging than traditional Voice Conversion (VC) because it must accommodate unaligned corpora from the source and target speakers. We therefore focus on generating a parallel corpus for CLVC and on bridging the gap between speakers and languages. For this purpose, we perform text-independent voice conversion with a three-layered conventional Neural Network (NN). The main contributions are i) source similarity in both the training and conversion stages of CLVC, ii) generation of a parallel corpus, and iii) text-independent and transcription-free CLVC. We exploit two variants of a Neural Network in the proposed framework: i) an autoencoder to enable the source similarity and the generation of the parallel corpus, and ii) a traditional DNN for feature mapping between the source and target. The subjective and objective evaluations show that the proposed method is capable of performing CLVC with auto-encoded speech.

Pages: 7

50 items in total

[1] Spectrum and Prosody Conversion for Cross-lingual Voice Conversion with CycleGAN
Du, Zongyang
Zhou, Kun
Sisman, Berrak
Li, Haizhou
[J]. 2020 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2020, : 507 - 513
[2] Frame Alignment Method for Cross-lingual Voice Conversion
Erro, Daniel
Moreno, Asuncion
[J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1533 - 1536
[3] ON THE STUDY OF GENERATIVE ADVERSARIAL NETWORKS FOR CROSS-LINGUAL VOICE CONVERSION
Sisman, Berrak
Zhang, Mingyang
Dong, Minghui
Li, Haizhou
[J]. 2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 144 - 151
[4] RefXVC: Cross-Lingual Voice Conversion With Enhanced Reference Leveraging
Zhang, Mingyang
Zhou, Yi
Ren, Yi
Zhang, Chen
Yin, Xiang
Li, Haizhou
[J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 4146 - 4156
[5] Cross-lingual Voice Conversion with Disentangled Universal Linguistic Representations
Yang, Zhenchuan
Zhang, Weibin
Liu, Yufei
Xing, Xiaofen
[J]. INTERSPEECH 2021, 2021, : 1604 - 1608
[6] CROSS-LINGUAL VOICE CONVERSION WITH BILINGUAL PHONETIC POSTERIORGRAM AND AVERAGE MODELING
Zhou, Yi
Tian, Xiaohai
Xu, Haihua
Das, Rohan Kumar
Li, Haizhou
[J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6790 - 6794
[7] Cross-Lingual Voice Conversion with a Cycle Consistency Loss on Linguistic Representation
Zhou, Yi
Tian, Xiaohai
Wu, Zhizheng
Li, Haizhou
[J]. INTERSPEECH 2021, 2021, : 1374 - 1378
[8] Multi-Task WaveRNN With an Integrated Architecture for Cross-Lingual Voice Conversion
Zhou, Yi
Tian, Xiaohai
Li, Haizhou
[J]. IEEE SIGNAL PROCESSING LETTERS, 2020, 27 : 1310 - 1314
[9] DNN-Based Cross-Lingual Voice Conversion Using Bottleneck Features
Reddy, M. Kiran
Rao, K. Sreenivasa
[J]. NEURAL PROCESSING LETTERS, 2020, 51 (02) : 2029 - 2042