SINGAN: Singing Voice Conversion with Generative Adversarial Networks

Cited by: 0
Authors
Sisman, Berrak [1, 2]
Vijayan, Karthika [1]
Dong, Minghui [2]
Li, Haizhou [1]
Affiliations
[1] National University of Singapore, Singapore
[2] A*STAR, Institute for Infocomm Research, Singapore
Funding
National Research Foundation, Singapore
Keywords
Singing voice conversion; generative adversarial networks; singing voice
DOI
10.1109/apsipaasc47483.2019.9023162
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Singing voice conversion (SVC) is the task of converting a source singer's voice to sound like that of a target singer without changing the lyrical content. So far, most voice conversion studies have focused on speech voice conversion, which differs from singing voice conversion. We note that singing conveys both lexical and emotional information through words and tones. It is one of the most expressive components of music and a means of entertainment as well as self-expression. In this paper, we propose a novel singing voice conversion framework based on Generative Adversarial Networks (GANs). The proposed GAN-based conversion framework, which we call SINGAN, consists of two neural networks: a discriminator that distinguishes natural from converted singing voice, and a generator that tries to deceive the discriminator. With the GAN, we minimize the difference between the distributions of the original target parameters and the generated singing parameters. To the best of our knowledge, this is the first framework that uses generative adversarial networks for singing voice conversion. In experiments, we show that the proposed method effectively converts singing voices and outperforms the baseline approach.
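The adversarial setup described in the abstract can be illustrated with a minimal sketch. It assumes the standard binary cross-entropy GAN objective with a non-saturating generator loss; the abstract does not state the exact losses used in SINGAN, so the function names and the example scores below are hypothetical and for illustration only.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy loss for the discriminator (standard GAN form, assumed).

    d_real: discriminator outputs in (0, 1) on natural target singing frames.
    d_fake: discriminator outputs in (0, 1) on converted (generated) frames.
    """
    # Natural frames should be scored close to 1.
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    # Converted frames should be scored close to 0.
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating generator loss: low when converted frames are scored as natural."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# Hypothetical discriminator scores, not taken from the paper.
d_real = [0.9, 0.8]
d_fake = [0.2, 0.1]
print("discriminator loss:", discriminator_loss(d_real, d_fake))
print("generator loss:", generator_loss(d_fake))
```

In this sketch the generator loss falls as the discriminator scores converted frames closer to 1, which is the sense in which the generator learns to deceive the discriminator. When the discriminator can no longer tell the two apart (all scores at 0.5), the generated and target distributions are indistinguishable to it.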
Pages: 112-118
Page count: 7