Parallel data free singing voice conversion with cycle-consistent BEGAN

Cited by: 1
Authors
Yousuf, Assila [1]
George, David Solomon [2]
Affiliations
[1] Rajiv Gandhi Inst Technol, Dept Elect & Commun Engn, Kottayam, Kerala, India
[2] Govt Engn Coll, Dept Elect & Commun Engn, Idukki, Kerala, India
Keywords
Singing voice conversion; GAN; Gated CNN; CycleGAN; BEGAN;
DOI
10.1016/j.matpr.2022.01.169
Chinese Library Classification
T [Industrial Technology];
Discipline classification code
08;
Abstract
Singing voice conversion (SVC) modifies the timbre of a source singer to match that of a target singer while retaining the linguistic content. Recent studies focus mainly on non-parallel training data, since parallel training data are difficult to obtain in real-life applications. In this paper, a parallel-data-free SVC technique is proposed using Cycle-consistent Boundary Equilibrium Generative Adversarial Networks (CycleBEGAN) with gated convolutional neural networks (CNNs) and an identity-mapping loss. CycleBEGAN learns the data distribution using both an adversarial loss and a cycle-consistency loss. The gated CNNs and the identity-mapping loss ensure that the sequential and hierarchical structures of the information are captured and that linguistic information is preserved. This technique produces high-quality converted singing voice without any time-alignment procedure and requires only a small amount of training data. Copyright (C) 2022 Elsevier Ltd. All rights reserved. Selection and peer-review under responsibility of the scientific committee of the International Conference on Artificial Intelligence & Energy Systems.
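The loss terms and the gating named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulations, so everything below is an assumption: the L1 form of the cycle-consistency and identity-mapping losses follows common CycleGAN practice, the gate is a standard gated linear unit, the BEGAN adversarial term is omitted, and `to_target` / `to_source` are hypothetical toy stand-ins for the two generators, not the authors' networks.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def glu(linear, gate):
    """Gated linear unit: element-wise linear * sigmoid(gate)."""
    return [a * sigmoid(b) for a, b in zip(linear, gate)]

def l1(x, y):
    """Mean absolute difference between two equal-length feature vectors."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

def cycle_loss(x, y, g_xy, g_yx):
    """L1 error after a round trip through both generators, in each direction."""
    return l1(g_yx(g_xy(x)), x) + l1(g_xy(g_yx(y)), y)

def identity_loss(x, y, g_xy, g_yx):
    """Penalise a generator for altering input already in its output domain."""
    return l1(g_xy(y), y) + l1(g_yx(x), x)

# Hypothetical toy generators (assumption): shift features by +1 / -1.
def to_target(v):
    return [a + 1.0 for a in v]

def to_source(v):
    return [a - 1.0 for a in v]

x = [0.0, 1.0, 2.0]  # toy source-singer features
y = [1.0, 2.0, 3.0]  # toy target-singer features

print(cycle_loss(x, y, to_target, to_source))     # 0.0: the round trip is exact
print(identity_loss(x, y, to_target, to_source))  # 2.0: both generators shift their input
```

In this toy case the two generators are exact inverses, so the cycle loss vanishes, while the identity loss is non-zero because each generator still changes features that already belong to its output domain.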
Pages: 157-161
Page count: 5