Cross-Lingual Voice Conversion-Based Polyglot Speech Synthesizer for Indian Languages

Cited by: 0

Authors:

Ramani, B. [1]

Jeeva, Actlin M. P. [1]

Vijayalakshmi, P. [1]

Nagarajan, T. [1]

Affiliations:

[1] SSN Coll Engn, Madras, Tamil Nadu, India

Source:

15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4 | 2014

Keywords:

polyglot; GMM; cross-lingual voice conversion;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A polyglot speech synthesizer synthesizes speech for any given monolingual or multilingual text in a single speaker's voice. This requires a polyglot speech corpus, but it is difficult to find a speaker proficient in multiple languages. Therefore, in the current work, a polyglot speech corpus is obtained for four Indian languages and Indian English using GMM-based cross-lingual voice conversion, exploiting the acoustic similarity of phonemes across Indian languages. The optimum target speaker and GMM topology are chosen based on the performance of a speaker identification system. It is observed that the language sharing the largest number of phonemes with the other languages serves as the best target. The polyglot speech corpus derived in this target speaker's voice is then used to develop an HMM-based polyglot speech synthesizer. The performance of this synthesizer is evaluated in terms of speaker identity using an ABX listening test, quality using mean opinion score (MOS), and speaker switching using a subjective listening test.


Pages: 775-779

Page count: 5
