EFFECTIVE WAVENET ADAPTATION FOR VOICE CONVERSION WITH LIMITED DATA

Cited by: 0
Authors
Du, Hongqiang [1, 2]
Tian, Xiaohai [2]
Xie, Lei [1]
Li, Haizhou [2, 3]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Audio Speech & Language Proc Lab, Xian, Peoples R China
[2] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore
[3] Univ Bremen, Machine Listening Lab, Bremen, Germany
Funding
National Research Foundation, Singapore
Keywords
Voice Conversion (VC); WaveNet adaptation; Singular Value Decomposition (SVD)
DOI
10.1109/icassp40776.2020.9053315
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
WaveNet has shown great potential as a direct conversion model in voice conversion. However, because of its model complexity, WaveNet requires a large amount of training data, which limits its application in voice conversion, where training data is scarce. In this paper, we propose a WaveNet adaptation method that effectively reduces the amount of adaptation data needed. We first train a speaker-independent WaveNet conversion model on a multi-speaker dataset. The model is then adapted with a limited amount of target-speaker data. Specifically, singular value decomposition (SVD) is applied to the dilated convolution layers of WaveNet to reduce the number of parameters, which makes adaptation more effective with limited data. Experiments on the CMU-ARCTIC and CSTR-VCTK corpora show that the proposed method outperforms baseline methods in both quality and similarity.
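The parameter reduction mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, the rank, and the helper names `factorize_conv_weight` and `count_params` are hypothetical, and the record does not say which factors are updated during adaptation. The sketch only shows how a truncated SVD turns one convolution weight into two smaller factors.

```python
import numpy as np


def factorize_conv_weight(weight, rank):
    """Truncated-SVD factorization of a 1-D convolution weight.

    weight: array of shape (out_channels, in_channels, kernel_size)
    Returns (a, b) with a: (out_channels, rank) and
    b: (rank, in_channels * kernel_size), so that a @ b approximates
    the weight flattened to (out_channels, in_channels * kernel_size).
    """
    out_ch, in_ch, k = weight.shape
    mat = weight.reshape(out_ch, in_ch * k)
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # fold the singular values into the left factor
    b = vt[:rank, :]
    return a, b


def count_params(*arrays):
    """Total number of scalar parameters in the given arrays."""
    return sum(x.size for x in arrays)


# Hypothetical dilated-convolution weight: 64 output channels,
# 64 input channels, kernel size 2 (assumed sizes, not from the paper).
rng = np.random.default_rng(0)
weight = rng.standard_normal((64, 64, 2))

a, b = factorize_conv_weight(weight, rank=8)
print(count_params(weight), count_params(a, b))  # prints: 8192 1536
```

With these assumed sizes the factorized layer holds 1536 parameters instead of 8192. A smaller rank gives fewer parameters at the cost of a coarser approximation of the original weight.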
Pages: 7779-7783
Number of pages: 5
Related papers
50 in total
  • [1] Factorized WaveNet for voice conversion with limited data
    Du, Hongqiang
    Tian, Xiaohai
    Xie, Lei
    Li, Haizhou
    [J]. SPEECH COMMUNICATION, 2021, 130 : 45 - 54
  • [2] WaveNet Vocoder with Limited Training Data for Voice Conversion
    Liu, Li-Juan
    Ling, Zhen-Hua
    Jiang, Yuan
    Zhou, Ming
    Dai, Li-Rong
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 1983 - 1987
  • [3] STATISTICAL VOICE CONVERSION BASED ON WAVENET
    Niwa, Jumpei
    Yoshimura, Takenori
    Hashimoto, Kei
    Oura, Keiichiro
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5289 - 5293
  • [4] A Speaker-Dependent WaveNet for Voice Conversion with Non-Parallel Data
    Tian, Xiaohai
    Chng, Eng Siong
    Li, Haizhou
    [J]. INTERSPEECH 2019, 2019, : 201 - 205
  • [5] WAVENET FACTORIZATION WITH SINGULAR VALUE DECOMPOSITION FOR VOICE CONVERSION
    Du, Hongqiang
    Tian, Xiaohai
    Xie, Lei
    Li, Haizhou
    [J]. 2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 152 - 159
  • [6] DeepConversion: Voice conversion with limited parallel training data
    Zhang, Mingyang
    Sisman, Berrak
    Zhao, Li
    Li, Haizhou
    [J]. SPEECH COMMUNICATION, 2020, 122 : 31 - 43
  • [7] Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion
    Huang, Wen-Chin
    Wu, Yi-Chiao
    Hwang, Hsin-Te
    Tobing, Patrick Lumban
    Hayashi, Tomoki
    Kobayashi, Kazuhiro
    Toda, Tomoki
    Tsao, Yu
    Wang, Hsin-Min
    [J]. 2019 27TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2019,
  • [8] Statistical voice conversion with WaveNet-based waveform generation
    Kobayashi, Kazuhiro
    Hayashi, Tomoki
    Tamamori, Akira
    Toda, Tomoki
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1138 - 1142
  • [9] AN EVALUATION OF DEEP SPECTRAL MAPPINGS AND WAVENET VOCODER FOR VOICE CONVERSION
    Tobing, Patrick Lumban
    Hayashi, Tomoki
    Wu, Yi-Chiao
    Kobayashi, Kazuhiro
    Toda, Tomoki
    [J]. 2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 297 - 303
  • [10] ATTENTION-BASED WAVENET AUTOENCODER FOR UNIVERSAL VOICE CONVERSION
    Polyak, Adam
    Wolf, Lior
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6800 - 6804