Global Variance Modeling on the Log Power Spectrum of LSPs for HMM-based Speech Synthesis

被引:0
|
作者
Ling, Zhen-Hua [1 ]
Hu, Yu [1 ]
Dai, Li-Rong [1 ]
机构
[1] Univ Sci & Technol China, iFLYTEK Speech Lab, Hefei, Peoples R China
关键词
speech synthesis; hidden Markov model; global variance; power spectrum;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
This paper presents a method to model the global variance (GV) of log power spectrums derived from the line spectral pairs (LSPs) in a sentence for HMM-based parametric speech synthesis. Different from the conventional GV method where the observations for GV model training are the variances of spectral parameters for each training sentence, our proposed method directly models the temporal variances of each frequency point in the spectral envelope reconstructed using LSPs. At synthesis stage, the likelihood function of trained GV model is integrated into the maximum likelihood parameter generation algorithm to alleviate the over-smoothing effect on the generated spectral structures. Experiment results show that the proposed method can outperform the conventional GV method when LSPs are used as the spectral parameters and improve the naturalness of synthetic speech significantly.
引用
收藏
页码:825 / 828
页数:4
相关论文
共 50 条
  • [1] Integrating Global Variance of Log Power Spectrum Derived from LSPs into MGE Training for HMM-Based Parametric Speech Synthesis
    Sun, Yu-Sheng
    Ling, Zhen-Hua
    Yin, Xiang
    Dai, Li-Rong
    [J]. 2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014, : 201 - 205
  • [2] Considering Global Variance of the Log Power Spectrum Derived from Mel-Cepstrum in HMM-based Parametric Speech Synthesis
    Yin, Xiang
    Ling, Zhen-Hua
    Lei, Ming
    Dai, Li-Rong
    [J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1146 - 1149
  • [3] GLOBAL VARIANCE MODELING ON FREQUENCY DOMAIN DELTA LSP FOR HMM-BASED SPEECH SYNTHESIS
    Pan, Shifeng
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    Tao, Jianhua
    [J]. 2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 4716 - 4719
  • [4] Minimum generation error training with direct log spectral distortion on LSPs for HMM-based speech synthesis
    Wu, Yi-Jian
    Tokuda, Keiichi
    [J]. INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 577 - 580
  • [5] TRAJECTORY TRAINING CONSIDERING GLOBAL VARIANCE FOR HMM-BASED SPEECH SYNTHESIS
    Toda, Tomoki
    Young, Steve
    [J]. 2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 4025 - +
  • [6] A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
    Toda, Tomoki
    Tokuda, Keiichi
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2007, E90D (05): : 816 - 824
  • [7] Asynchronous F0 and Spectrum Modeling for HMM-Based Speech Synthesis
    Wang, Cheng-Cheng
    Ling, Zhen-Hua
    Dai, Li-Rong
    [J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 412 - 415
  • [8] State duration modeling for HMM-based speech synthesis
    Zen, Heiga
    Masuko, Takashi
    Tokuda, Keiichi
    Yoshimura, Takayoshi
    Kobayasih, Takao
    Kitamura, Tadashi
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2007, E90D (03): : 692 - 693
  • [9] Minimum generation error criterion considering global/local variance for HMM-based speech synthesis
    Wu, Yi-Jian
    Zen, Heiga
    Nankaku, Yoshilliko
    Tokuda, Keiichi
    [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4621 - 4624
  • [10] A POSTFILTER TO MODIFY THE MODULATION SPECTRUM IN HMM-BASED SPEECH SYNTHESIS
    Takamichi, Shinnosuke
    Toda, Tomoki
    Neubig, Graham
    Sakti, Sakriani
    Nakamura, Satoshi
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,