Multi-Voice Singing Synthesis From Lyrics

Cited by: 2
Authors
Resna, S. [1 ]
Rajan, Rajeev [2 ]
Affiliations
[1] Tata Elxsi, MultiMedia & Commun Vert, Technopk, Thiruvananthapuram, Kerala, India
[2] APJ Abdul Kalam Technol Univ, Dept Elect & Commun Engn, Coll Engn, Thiruvananthapuram, Kerala, India
Keywords
Multi-speaker; Text-to-singing conversion; Singing voice synthesis; Phonetic quality
DOI
10.1007/s00034-022-02122-3
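Chinese Library Classification
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline classification codes
0808; 0809
Abstract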
In this paper, a multi-voice singing synthesis framework is proposed to convert lyrics into their sung version in a target speaker's voice. It consists of three blocks: a text-to-speech (TTS) module, a speech-to-singing (STS) module, and an intelligibility enhancement module. In the front end, a TTS converter generates synthesized speech from the lyrics in the target speaker's voice. The STS module then synthesizes a sung version in the target melody through an encoder-decoder model. Finally, phonetic intelligibility is enhanced by an intelligibility enhancement module based on an audio style transfer scheme. The proposed system is evaluated on the LibriSpeech and NUS-48E corpora using both subjective and objective measures. We compare our model with a state-of-the-art multi-voice singing synthesis model based on a generative adversarial network (GAN). Our study shows that the proposed model performs on par with the baseline model without requiring any phoneme annotations.
Pages: 307-321
Page count: 15