Cross-Lingual Voice Conversion-Based Polyglot Speech Synthesizer for Indian Languages

Cited by: 0
Authors
Ramani, B. [1]
Jeeva, Actlin M. P. [1]
Vijayalakshmi, P. [1]
Nagarajan, T. [1]
Affiliations
[1] SSN Coll Engn, Madras, Tamil Nadu, India
Keywords
polyglot; GMM; cross-lingual voice conversion
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
A polyglot speech synthesizer synthesizes speech for any given monolingual or multilingual text in a single speaker's voice. Building one requires a polyglot speech corpus, but it is difficult to find a speaker proficient in multiple languages. Therefore, in the current work, by exploiting the acoustic similarity of phonemes across Indian languages, a polyglot speech corpus is obtained for four Indian languages and Indian English using GMM-based cross-lingual voice conversion. The optimum target speaker and GMM topology are chosen based on the performance of a speaker identification system. It is observed that the language that shares the largest number of phonemes with the other languages serves as the best target. A polyglot speech corpus derived in this target speaker's voice is then used to develop an HMM-based polyglot speech synthesizer. The performance of this synthesizer is evaluated in terms of speaker identity using an ABX listening test, quality using the mean opinion score (MOS), and speaker switching using a subjective listening test.
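The observation about phoneme sharing can be illustrated with a small sketch. It is not the authors' selection procedure (the abstract states that the target is chosen from speaker identification performance); it only shows, under stated assumptions, how one might rank languages by the number of phonemes they share with the others. The function names and the toy phoneme inventories are hypothetical and are not taken from the paper.

```python
from typing import Dict, Set


def shared_phoneme_count(lang: str, inventories: Dict[str, Set[str]]) -> int:
    """Sum of the phonemes `lang` has in common with each other language."""
    own = inventories[lang]
    return sum(
        len(own & other)
        for name, other in inventories.items()
        if name != lang
    )


def pick_target_language(inventories: Dict[str, Set[str]]) -> str:
    """Return the language with the largest total phoneme overlap."""
    return max(inventories, key=lambda lang: shared_phoneme_count(lang, inventories))


# Hypothetical toy inventories, for illustration only (not real language data).
toy_inventories = {
    "lang_a": {"a", "i", "u", "k", "t", "p"},
    "lang_b": {"a", "i", "k", "t", "m"},
    "lang_c": {"a", "u", "k", "n"},
}

if __name__ == "__main__":
    for lang in toy_inventories:
        print(lang, shared_phoneme_count(lang, toy_inventories))
    print("candidate target:", pick_target_language(toy_inventories))
```

A real system would presumably work from a common phone set across the languages and confirm the choice with the speaker identification experiments that the abstract describes.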
Pages: 775 - 779
Number of pages: 5
Related papers
50 in total
  • [1] Voice Conversion-Based Multilingual to Polyglot Speech Synthesizer for Indian Languages
    Ramani, B.
    Jeeva, Actlin M. P.
    Vijayalakshmi, P.
    Nagarajan, T.
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE OF IEEE REGION 10 (TENCON), 2013,
  • [2] A Multilingual to Polyglot Speech Synthesizer for Indian Languages Using a Voice-Converted Polyglot Speech Corpus
    Vijayalakshmi, P.
    Ramani, B.
    Jeeva, M. P. Actlin
    Nagarajan, T.
    [J]. CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2018, 37 (05) : 2142 - 2163
  • [3] A Multilingual to Polyglot Speech Synthesizer for Indian Languages Using a Voice-Converted Polyglot Speech Corpus
    P. Vijayalakshmi
    B. Ramani
    M. P. Actlin Jeeva
    T. Nagarajan
    [J]. Circuits, Systems, and Signal Processing, 2018, 37 : 2142 - 2163
  • [4] An Approach to Cross-Lingual Voice Conversion
    Rallabandi, Sai Sirisha
    Gangashetty, Suryakanth V.
    [J]. 2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019,
  • [5] CROSS-LINGUAL FRAME SELECTION METHOD FOR POLYGLOT SPEECH SYNTHESIS
    Chen, Chia-Ping
    Huang, Yi-Chin
    Wu, Chung-Hsien
    Lee, Kuan-De
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4521 - 4524
  • [6] Spectrum and Prosody Conversion for Cross-lingual Voice Conversion with CycleGAN
    Du, Zongyang
    Zhou, Kun
    Sisman, Berrak
    Li, Haizhou
    [J]. 2020 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2020, : 507 - 513
  • [7] Cross-Lingual Sentiment Analysis for Indian Regional Languages
    Impana, P.
    Kallimani, Jagadish S.
    [J]. 2017 INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS, COMMUNICATION, COMPUTER, AND OPTIMIZATION TECHNIQUES (ICEECCOT), 2017, : 867 - 872
  • [8] Cross-Lingual Information Retrieval System for Indian Languages
    Jagarlamudi, Jagadeesh
    Kumaran, A.
    [J]. ADVANCES IN MULTILINGUAL AND MULTIMODAL INFORMATION RETRIEVAL, 2008, 5152 : 80 - 87
  • [9] Frame Alignment Method for Cross-lingual Voice Conversion
    Erro, Daniel
    Moreno, Asuncion
    [J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1533 - 1536
  • [10] Polyglot Speech Synthesis Based on Cross-Lingual Frame Selection Using Auditory and Articulatory Features
    Chen, Chia-Ping
    Huang, Yi-Chin
    Wu, Chung-Hsien
    Lee, Kuan-De
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (10) : 1558 - 1570