CROSS-LINGUAL TEXT-TO-SPEECH VIA HIERARCHICAL STYLE TRANSFER

Authors
Lee, Sang-Hoon [1]
Choi, Ha-Yeong [1]
Lee, Seong-Whan [1]
Affiliation
[1] Korea Univ, Dept Artificial Intelligence, Seoul, South Korea
Keywords
Cross-lingual TTS; Multi-lingual TTS
DOI
10.1109/ICASSPW62465.2024.10627450
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
This paper presents LIMITLESS, a cross-lingual text-to-speech (TTS) system based on hierarchical style transfer that can transfer prosody and voice style separately. Building upon HierSpeech++, we use its two-stage hierarchical speech synthesis framework, consisting of text-to-vector (TTV) and vector-to-speech stages. We modify the TTV only by adding a per-language embedding to the text representation, and we use the hierarchical speech synthesizer without modification. We train the TTV model on 7 languages and 14 speakers from the Indic languages dataset released for LIMMITS 2024, and then fine-tune it on the target speakers for Tracks 1 and 2. The results show that our framework transfers voice style robustly in terms of speaker similarity.
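The language-embedding step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the hidden size, the random initialisation, and the names `language_embedding` and `add_language_embedding` are assumptions made for illustration only; the abstract states only that a language embedding is added to the text representation and that 7 languages are used.

```python
import random

# Assumed sizes: the abstract gives 7 languages; HIDDEN_DIM is hypothetical.
NUM_LANGUAGES = 7
HIDDEN_DIM = 4

random.seed(0)

# One vector per language. In a real model these would be learned parameters;
# here they are randomly initialised purely for illustration.
language_embedding = [
    [random.gauss(0.0, 0.02) for _ in range(HIDDEN_DIM)]
    for _ in range(NUM_LANGUAGES)
]


def add_language_embedding(text_repr, lang_id):
    """Add the embedding of language `lang_id` to every frame of `text_repr`.

    text_repr: list of frames, each a list of HIDDEN_DIM floats.
    Returns a new list of frames; the input is left unchanged.
    """
    lang_vec = language_embedding[lang_id]
    return [[x + e for x, e in zip(frame, lang_vec)] for frame in text_repr]


# Usage: three all-zero text frames conditioned on language index 2.
text_repr = [[0.0] * HIDDEN_DIM for _ in range(3)]
conditioned = add_language_embedding(text_repr, lang_id=2)
```

Because the same vector is added to every frame, the text encoder output keeps its shape, which is consistent with the abstract's statement that the rest of the pipeline is used without modification.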
Pages: 25-26
Page count: 2