Cross-Lingual Voice Conversion-Based Polyglot Speech Synthesizer for Indian Languages

Cited by: 0

Authors:

Ramani, B. [1]

Jeeva, Actlin M. P. [1]

Vijayalakshmi, P. [1]

Nagarajan, T. [1]

Affiliation:

[1] SSN Coll Engn, Madras, Tamil Nadu, India

Source:

15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4 | 2014

Keywords:

polyglot; GMM; cross-lingual voice conversion;

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A polyglot speech synthesizer synthesizes speech for any given monolingual or multilingual text in a single speaker's voice. This requires a polyglot speech corpus, but it is difficult to find a speaker proficient in multiple languages. Therefore, the current work exploits the acoustic similarity of phonemes across Indian languages to obtain a polyglot speech corpus for four Indian languages and Indian English, using GMM-based cross-lingual voice conversion. The optimum target speaker and GMM topology are chosen based on the performance of a speaker identification system. It is observed that the language that shares the largest number of phonemes with the other languages serves as the best target. A polyglot speech corpus derived in this target speaker's voice is then used to develop an HMM-based polyglot speech synthesizer. The performance of this synthesizer is evaluated in terms of speaker identity using an ABX listening test, quality using mean opinion score (MOS), and speaker switching using a subjective listening test.
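The abstract's observation about phoneme sharing can be illustrated with a small sketch. This is not the authors' procedure (they select the target via a speaker identification system); it only shows one way to count how many phonemes each language shares with the others. The function names and the toy inventories below are hypothetical placeholders, not real phoneme sets for any language.

```python
from typing import Dict, Set


def shared_phoneme_counts(inventories: Dict[str, Set[str]]) -> Dict[str, int]:
    """For each language, sum the number of phonemes it shares with every other language."""
    counts = {}
    for lang, phones in inventories.items():
        counts[lang] = sum(
            len(phones & other)
            for name, other in inventories.items()
            if name != lang
        )
    return counts


def best_target(inventories: Dict[str, Set[str]]) -> str:
    """Return the language with the largest total phoneme overlap (assumed selection rule)."""
    counts = shared_phoneme_counts(inventories)
    return max(counts, key=counts.get)


# Toy, made-up inventories used only to exercise the functions.
toy_inventories = {
    "lang_a": {"a", "i", "u", "k", "t", "p", "m"},
    "lang_b": {"a", "i", "k", "t", "n"},
    "lang_c": {"a", "u", "p", "m", "s"},
}

print(shared_phoneme_counts(toy_inventories))
print(best_target(toy_inventories))
```

Summing pairwise overlaps is only one possible reading of "shares the most phonemes"; counting phonemes common to all languages would be an equally plausible alternative.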

Pages: 775-779

Page count: 5

