ADAPTIVE WAVENET VOCODER FOR RESIDUAL COMPENSATION IN GAN-BASED VOICE CONVERSION

Authors
Sisman, Berrak [1, 2, 3]
Zhang, Mingyang [1]
Sakti, Sakriani [2, 3]
Li, Haizhou [1]
Nakamura, Satoshi [2, 3]
Affiliations
[1] Natl Univ Singapore, Singapore, Singapore
[2] Nara Inst Sci & Technol, Nara, Japan
[3] RIKEN, Ctr Adv Intelligence Project AIP, Tokyo, Japan
Keywords
voice conversion; generative adversarial networks; adaptive WaveNet; residual compensation
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose to use generative adversarial networks (GAN) together with a WaveNet vocoder to address the over-smoothing problem that arises in deep learning approaches to voice conversion, and to improve vocoding quality over traditional vocoders. Because the GAN aims to minimize the divergence between the natural and converted speech parameters, it effectively alleviates over-smoothing in the converted speech. The WaveNet vocoder, in turn, allows us to leverage human speech from a large speaker population, thus improving the naturalness of the synthetic voice. Furthermore, for the first time, we study how to use a WaveNet vocoder for residual compensation to improve voice conversion performance. The experiments show that the proposed voice conversion framework consistently outperforms the baselines.
Pages: 282-289
Page count: 8