Global Variance Modeling on the Log Power Spectrum of LSPs for HMM-based Speech Synthesis

Authors
Ling, Zhen-Hua [1]
Hu, Yu [1]
Dai, Li-Rong [1]
Affiliations
[1] Univ Sci & Technol China, iFLYTEK Speech Lab, Hefei, Peoples R China
Keywords
speech synthesis; hidden Markov model; global variance; power spectrum
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a method for modeling the global variance (GV) of the log power spectra derived from the line spectral pairs (LSPs) of a sentence in HMM-based parametric speech synthesis. Unlike the conventional GV method, in which the observations used to train the GV model are the per-sentence variances of the spectral parameters themselves, the proposed method directly models the temporal variance at each frequency point of the spectral envelope reconstructed from the LSPs. At the synthesis stage, the likelihood function of the trained GV model is integrated into the maximum likelihood parameter generation algorithm to alleviate over-smoothing of the generated spectral structures. Experimental results show that, when LSPs are used as the spectral parameters, the proposed method outperforms the conventional GV method and significantly improves the naturalness of the synthetic speech.
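The quantity the abstract describes, the temporal variance of each frequency point of the LSP-derived spectral envelope, can be illustrated with a short sketch. This is not the authors' implementation: the function names, the even LSP order, the unit gain, the uniform frequency grid, and the use of natural logarithms are all assumptions made for illustration. The envelope is taken to be the all-pole spectrum 1/|A(e^{jω})|², evaluated with the standard closed form in terms of the LSP frequencies.

```python
import numpy as np

def lsp_to_log_power_spectrum(lsp, n_freq=64):
    """Log power spectrum of the all-pole envelope 1/|A(e^{jw})|^2.

    lsp: array of shape (T, p); each row holds p LSP frequencies in radians,
    strictly ascending inside (0, pi). Assumes even p and unit gain
    (both are assumptions of this sketch, not taken from the paper).
    Returns an array of shape (T, n_freq) on a uniform grid over [0, pi].
    """
    lsp = np.atleast_2d(np.asarray(lsp, dtype=float))
    p = lsp.shape[1]
    if p % 2 != 0:
        raise ValueError("this sketch only handles an even LSP order")
    omega = np.linspace(0.0, np.pi, n_freq)
    # cos(omega_k) - cos(lsp[t, i]) for every frame, bin and LSP: (T, n_freq, p)
    diff = np.cos(omega)[None, :, None] - np.cos(lsp)[:, None, :]
    # The 1st, 3rd, ... LSPs are roots of the symmetric polynomial,
    # the 2nd, 4th, ... LSPs are roots of the antisymmetric polynomial.
    sym = np.prod(diff[:, :, 0::2] ** 2, axis=2)
    anti = np.prod(diff[:, :, 1::2] ** 2, axis=2)
    a_sq = 2.0 ** p * (np.cos(omega / 2) ** 2 * sym
                       + np.sin(omega / 2) ** 2 * anti)
    return -np.log(a_sq)

def spectral_gv(lsp, n_freq=64):
    """Variance over time of each frequency bin for one sentence."""
    return np.var(lsp_to_log_power_spectrum(lsp, n_freq), axis=0)

# Usage with synthetic data: 200 frames of random, sorted 24th-order LSPs.
rng = np.random.default_rng(0)
lsp = np.sort(rng.uniform(0.05, np.pi - 0.05, size=(200, 24)), axis=1)
gv = spectral_gv(lsp)
print(gv.shape)
```

Each sentence yields one vector with an entry per frequency bin, whereas the conventional GV method would take the variance of the LSP trajectories directly (`np.var(lsp, axis=0)`). How the paper models these vectors and weights the resulting likelihood during parameter generation is not stated in the abstract, so it is left out here.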
Pages: 825-828
Page count: 4