Parallel data free singing voice conversion with cycle-consistent BEGAN

Cited by: 1
Authors
Yousuf, Assila [1]
George, David Solomon [2]
Affiliations
[1] Rajiv Gandhi Inst Technol, Dept Elect & Commun Engn, Kottayam, Kerala, India
[2] Govt Engn Coll, Dept Elect & Commun Engn, Idukki, Kerala, India
Keywords
Singing voice conversion; GAN; Gated CNN; CycleGAN; BEGAN
DOI
10.1016/j.matpr.2022.01.169
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Singing voice conversion (SVC) is the task of converting the timbre of a source singer to that of a target singer while retaining the linguistic content. Recent studies have focused mainly on non-parallel training data, since parallel training data are difficult to obtain in real-life applications. In this paper, a parallel-data-free SVC technique is proposed that uses Cycle-consistent Boundary Equilibrium Generative Adversarial Networks (CycleBEGAN) with gated convolutional neural networks (CNNs) and an identity-mapping loss. CycleBEGAN learns the data distribution using both an adversarial loss and a cycle-consistency loss. The gated CNNs and the identity-mapping loss ensure that the sequential and hierarchical structure of the information is captured and that linguistic information is preserved. The technique produces high-quality converted singing voice without any time-alignment procedure and requires only a small amount of training data. Copyright (C) 2022 Elsevier Ltd. All rights reserved. Selection and peer-review under responsibility of the scientific committee of the International Conference on Artificial Intelligence & Energy Systems.
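The loss terms and the gating named in the abstract can be illustrated with a minimal sketch. This record does not give the paper's exact formulation, so the code below assumes the L1 form of the cycle-consistency and identity-mapping losses commonly used in CycleGAN-style voice conversion, and a standard gated linear unit. The toy generators `g_xy` and `g_yx` and all function names are hypothetical, and the BEGAN adversarial term is left out.

```python
import math

def l1_distance(a, b):
    """Mean absolute difference between two equal-length feature vectors."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)

def glu(linear, gate):
    """Gated linear unit: each linear output is scaled by sigmoid(gate)."""
    return [h / (1.0 + math.exp(-g)) for h, g in zip(linear, gate)]

def cycle_consistency_loss(x, y, g_xy, g_yx):
    """L1 error after a round trip source -> target -> source, plus the reverse."""
    return l1_distance(g_yx(g_xy(x)), x) + l1_distance(g_xy(g_yx(y)), y)

def identity_mapping_loss(x, y, g_xy, g_yx):
    """L1 penalty when a generator alters input already in its output domain."""
    return l1_distance(g_xy(y), y) + l1_distance(g_yx(x), x)

# Hypothetical toy generators standing in for trained networks:
# source -> target adds a constant offset, target -> source removes it.
def g_xy(features):
    return [v + 1.0 for v in features]

def g_yx(features):
    return [v - 1.0 for v in features]

x = [0.0, 0.5, 1.0]  # assumed source-singer features
y = [1.0, 1.5, 2.0]  # assumed target-singer features

cycle_loss = cycle_consistency_loss(x, y, g_xy, g_yx)
identity_loss = identity_mapping_loss(x, y, g_xy, g_yx)
gated = glu([2.0], [0.0])
```

With these toy generators the round trip is exact, so the cycle term vanishes, while the identity term is nonzero because each generator still shifts input that already belongs to its output domain. In training, both terms would be weighted and added to the adversarial loss.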
Pages: 157-161
Number of pages: 5