Speaker-dependent WaveNet vocoder

Cited by: 209
Authors
Tamamori, Akira [1]
Hayashi, Tomoki [2]
Kobayashi, Kazuhiro [3]
Takeda, Kazuya [2]
Toda, Tomoki [3]
Affiliations
[1] Nagoya Univ, Inst Innovat Future Soc, Nagoya, Aichi, Japan
[2] Nagoya Univ, Grad Sch Informat Sci, Nagoya, Aichi, Japan
[3] Nagoya Univ, Informat Technol Ctr, Nagoya, Aichi, Japan
Funding
Japan Science and Technology Agency
Keywords
WaveNet; convolutional neural network; vocoder; deep neural network; speech synthesis system; representation; selection; spectrum
DOI
10.21437/Interspeech.2017-314
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this study, we propose a speaker-dependent WaveNet vocoder, a method of synthesizing speech waveforms with WaveNet by using acoustic features from an existing vocoder as auxiliary features of WaveNet. WaveNet is expected to learn a sample-by-sample correspondence between the speech waveform and the acoustic features. The advantage of the proposed method is that it requires neither (1) explicit modeling of excitation signals nor (2) various assumptions that are based on prior knowledge specific to speech. We conducted both subjective and objective evaluation experiments on the CMU-ARCTIC database. The objective evaluation demonstrated that the proposed method could generate high-quality speech while recovering the phase information that is lost by a mel-cepstrum vocoder. The subjective evaluation demonstrated that the sound quality of the proposed method was significantly better than that of the mel-cepstrum vocoder and that the proposed method could capture source excitation information more accurately.
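The idea of conditioning WaveNet on vocoder features sample by sample can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the auxiliary features are aligned or injected, so the sample-and-hold upsampling, the scalar gated unit (the conditional gated activation from the original WaveNet formulation), and all names, weights, and toy values below are assumptions chosen for illustration.

```python
import math


def upsample_features(frames, hop_size):
    """Repeat each frame-level feature vector hop_size times (sample-and-hold).

    Assumption: this is one simple way to give every waveform sample its own
    auxiliary feature vector; the abstract does not specify the alignment.
    """
    return [list(frame) for frame in frames for _ in range(hop_size)]


def gated_activation(x, h, w_f, w_g, v_f, v_g):
    """Conditional gated unit for one time step with a scalar input x.

    z = tanh(w_f * x + v_f . h) * sigmoid(w_g * x + v_g . h)
    where h is the auxiliary feature vector for this sample.
    """
    filt = w_f * x + sum(v * a for v, a in zip(v_f, h))
    gate = w_g * x + sum(v * a for v, a in zip(v_g, h))
    return math.tanh(filt) * (1.0 / (1.0 + math.exp(-gate)))


# Toy data (hypothetical): 2 frames of 3-dimensional acoustic features,
# a hop of 4 samples per frame, and 8 waveform samples.
frames = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
hop_size = 4
waveform = [0.0, 0.1, 0.2, 0.1, 0.0, -0.1, -0.2, -0.1]

aux = upsample_features(frames, hop_size)

# Hypothetical weights; a trained network would learn these.
w_f, w_g = 0.5, -0.3
v_f, v_g = [0.1, 0.0, -0.1], [0.2, 0.2, 0.2]

# One conditioned activation per waveform sample.
outputs = [
    gated_activation(x, h, w_f, w_g, v_f, v_g)
    for x, h in zip(waveform, aux)
]
```

After upsampling, `aux` has one feature vector per waveform sample, which is what lets the network relate each sample to the acoustic features in effect at that instant.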
Pages: 1118-1122
Number of pages: 5