Multi-Voice Singing Synthesis From Lyrics

Cited by: 2
Authors
Resna, S. [1]
Rajan, Rajeev [2]
Affiliations
[1] Tata Elxsi, MultiMedia & Commun Vert, Technopk, Thiruvananthapuram, Kerala, India
[2] APJ Abdul Kalam Technol Univ, Dept Elect & Commun Engn, Coll Engn, Thiruvananthapuram, Kerala, India
Keywords
Multi-speaker; Text-to-singing conversion; Singing voice synthesis; Phonetic quality
DOI
10.1007/s00034-022-02122-3
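Chinese Library Classification (CLC) codes
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract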
In this paper, a multi-voice singing synthesis framework is proposed to convert lyrics to their sung version in a target speaker's voice. It consists of three blocks: a text-to-speech (TTS) module, a speech-to-singing (STS) module, and an intelligibility enhancement module. In the front end, a TTS converter generates synthesized speech from the lyrics in the target speaker's voice. Next, the STS module synthesizes a sung version in the target melody through an encoder-decoder model. Finally, phonetic intelligibility is improved by an intelligibility enhancement module based on an audio style transfer scheme. The proposed system is evaluated on the LibriSpeech and NUS-48E corpora using both subjective and objective measures. We compare our model with a state-of-the-art multi-voice singing synthesis model based on a generative adversarial network (GAN). Our study shows that the proposed model performs on par with the baseline model without requiring any phoneme annotations.
Pages: 307-321
Page count: 15
Related papers
50 in total
  • [1] Multi-Voice Singing Synthesis From Lyrics
    S. Resna
    Rajeev Rajan
    Circuits, Systems, and Signal Processing, 2023, 42 : 307 - 321
  • [2] Word Intelligibility in Multi-voice Singing: The Influence of Chorus Size
    Condit-Schultz, Nathaniel
    Huron, David
    JOURNAL OF VOICE, 2017, 31 (01) : 121.e1 - 121.e8
  • [3] WGANSing: A Multi-Voice Singing Voice Synthesizer Based on the Wasserstein-GAN
    Chandna, Pritish
    Blaauw, Merlijn
    Bonada, Jordi
    Gomez, Emilia
    2019 27TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2019,
  • [4] A Lyrics to Singing Voice Synthesis system with variable timbre
    Li, Jinlong
    Yang, Hongwu
    Zhang, Weizhao
    Cai, Lianhong
    2010 THE 3RD INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND INDUSTRIAL APPLICATION (PACIIA2010), VOL II, 2010, : 109 - 112
  • [5] Hymnos - A network for the study of multi-voice singing between orality and writing
    Ginesi, Gianni
    TRANS-REVISTA TRANSCULTURAL DE MUSICA, 2012, 16
  • [6] A Lyrics to Singing Voice Synthesis System with Variable Timbre
    Li, Jinlong
    Yang, Hongwu
    Zhang, Weizhao
    Cai, Lianhong
    APPLIED INFORMATICS AND COMMUNICATION, PT 2, 2011, 225 : 186 - +
  • [7] Lyrics recognition from singing voice focused on correspondence between voice and notes
    Suzuki, Motoyuki
    Tomita, Sho
    Morita, Tomoki
    INTERSPEECH 2019, 2019, : 3238 - 3241
  • [8] Rhythm Speech Lyrics Input for MIDI-Based Singing Voice Synthesis
    Lee, Hong-Ru
    Huang, Chih-Fang
    Hsu, Chih-Hao
    Wang, Wen-Nan
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2009, 2009, 5879 : 459 - +
  • [9] Music Information Retrieval from a Singing Voice Using Lyrics and Melody Information
    Motoyuki Suzuki
    Toru Hosoya
    Akinori Ito
    Shozo Makino
    EURASIP Journal on Advances in Signal Processing, 2007
  • [10] Music information retrieval from a singing voice using lyrics and melody information
    Suzuki, Motoyuki
    Hosoya, Toru
    Ito, Akinori
    Makino, Shozo
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2007, 2007 (1)