Cross-Lingual Voice Conversion-Based Polyglot Speech Synthesizer for Indian Languages

Authors
Ramani, B. [1]
Jeeva, Actlin M. P. [1]
Vijayalakshmi, P. [1]
Nagarajan, T. [1]
Affiliations
[1] SSN Coll Engn, Madras, Tamil Nadu, India
Keywords
polyglot; GMM; cross-lingual voice conversion
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A polyglot speech synthesizer synthesizes speech for any given monolingual or multilingual text in a single speaker's voice. Building one requires a polyglot speech corpus, but it is difficult to find a speaker proficient in multiple languages. Therefore, the current work exploits the acoustic similarity of phonemes across Indian languages to obtain a polyglot speech corpus for four Indian languages and Indian English using GMM-based cross-lingual voice conversion. The optimum target speaker and GMM topology are chosen based on the performance of a speaker identification system. It is observed that the language sharing the largest number of phonemes with the other languages serves as the best target. The polyglot speech corpus derived in this target speaker's voice is then used to develop an HMM-based polyglot speech synthesizer. The performance of this synthesizer is evaluated in terms of speaker identity using an ABX listening test, quality using the mean opinion score (MOS), and speaker switching using a subjective listening test.
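The observation about phoneme sharing can be illustrated with a minimal sketch. It is not the authors' selection procedure, which relies on a speaker identification system; it only shows how one might count phonemes shared across languages. The language names, the phoneme inventories, and the helper functions `shared_phoneme_count` and `best_target` are all hypothetical and chosen purely for illustration.

```python
# Hypothetical toy phoneme inventories. These are NOT the inventories of the
# languages studied in the paper; they exist only to make the sketch runnable.
inventories = {
    "lang_a": {"a", "i", "u", "k", "t", "p", "m", "n"},
    "lang_b": {"a", "i", "u", "k", "t", "s"},
    "lang_c": {"a", "i", "k", "p", "m", "r"},
    "lang_d": {"a", "u", "t", "n", "s", "r"},
}


def shared_phoneme_count(candidate, inventories):
    """Sum, over every other language, the phonemes shared with `candidate`."""
    own = inventories[candidate]
    return sum(
        len(own & other)
        for name, other in inventories.items()
        if name != candidate
    )


def best_target(inventories):
    """Return the language with the largest total shared-phoneme count.

    Names are sorted first so that ties resolve deterministically
    (alphabetically earliest name wins).
    """
    return max(
        sorted(inventories),
        key=lambda name: shared_phoneme_count(name, inventories),
    )


if __name__ == "__main__":
    for name in sorted(inventories):
        print(name, shared_phoneme_count(name, inventories))
    print("best target:", best_target(inventories))
```

In this toy setup the candidate with the broadest overlap is picked as the target. A real system would need actual phoneme inventories and, as in the abstract, confirmation from speaker identification performance.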
Pages: 775 - 779
Number of pages: 5