An Approach to Cross-Lingual Voice Conversion

Cited by: 0
Authors
Rallabandi, Sai Sirisha [1]
Gangashetty, Suryakanth V. [1]
Affiliations
[1] Int Inst Informat Technol, Speech Proc Lab, Hyderabad, India
Keywords
Deep Neural Networks; Cross-Lingual Voice Conversion; Scaled Exponential Linear Units; Mel Generalised Cepstral Coefficients; Auto-encoded speech;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The most prevalent multilingual Text-to-Speech (TTS) synthesis systems exhibit an unnatural speaker shift at language boundaries when they are used for code-mixed TTS synthesis. Because collecting polyglot speech is non-trivial, several alternative approaches have been explored. Cross-Lingual Voice Conversion (CLVC) is one of them: it generates speech with the desired speaker and language identities. Our aim in this paper is to design a lightweight CLVC framework between a pair of Mandarin-English speakers. CLVC is more challenging than traditional Voice Conversion (VC) because it must accommodate unaligned corpora from the source and target speakers. We therefore focus on generating a parallel corpus for CLVC and on bridging the gap between speakers and languages. For this purpose, we perform text-independent voice conversion with a conventional three-layered Neural Network (NN). The main contributions are i) source similarity in both the training and conversion stages of CLVC, ii) generation of a parallel corpus, and iii) text-independent and transcription-free CLVC. We use two variants of a neural network in the proposed framework: i) an autoencoder to enable source similarity and the generation of the parallel corpus, and ii) a traditional DNN for feature mapping between the source and target. Subjective and objective evaluations show that the proposed method is capable of performing CLVC with auto-encoded speech.
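The two-network data flow described in the abstract can be sketched roughly as follows. This is a minimal, untrained NumPy illustration, not the authors' implementation: the abstract does not give layer sizes, training procedure, or how auto-encoded frames are paired with target frames, so `FEAT_DIM`, `HIDDEN`, the helper names `init_mlp` and `forward`, and the random stand-in features are all assumptions. Only the SELU activation (listed in the keywords) and the autoencoder-then-mapping order come from the record.

```python
import numpy as np

# Standard SELU constants (Klambauer et al., 2017).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def selu(x):
    """Scaled Exponential Linear Unit, applied element-wise."""
    x = np.asarray(x, dtype=float)
    # np.minimum keeps expm1 from overflowing on large positive inputs.
    negative_branch = SELU_ALPHA * np.expm1(np.minimum(x, 0.0))
    return SELU_SCALE * np.where(x > 0, x, negative_branch)


def init_mlp(sizes, rng):
    """Hypothetical helper: random weights with std = 1/sqrt(fan_in)."""
    return [
        (rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
        for m, n in zip(sizes[:-1], sizes[1:])
    ]


def forward(params, x):
    """Hypothetical helper: SELU hidden layers followed by a linear output."""
    for weights, bias in params[:-1]:
        x = selu(x @ weights + bias)
    weights, bias = params[-1]
    return x @ weights + bias


rng = np.random.default_rng(0)
FEAT_DIM = 40  # assumed feature dimension (e.g. MGC order); not from the source
HIDDEN = 64    # assumed hidden width; not from the source

# Network 1: autoencoder over source-speaker features (untrained here).
autoencoder = init_mlp([FEAT_DIM, HIDDEN, FEAT_DIM], rng)
# Network 2: feed-forward mapping from source to target features (untrained here).
mapper = init_mlp([FEAT_DIM, HIDDEN, HIDDEN, FEAT_DIM], rng)

# Random stand-in for a sequence of 100 source feature frames.
source_frames = rng.normal(size=(100, FEAT_DIM))
auto_encoded = forward(autoencoder, source_frames)
converted = forward(mapper, auto_encoded)
```

With real data, both networks would first be trained (for example with a mean-squared-error loss) before the conversion pass; the sketch only shows how frames flow through the autoencoder and then the mapping network.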
Pages: 7